Odessa City

Setup

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)

Attaching package: 'janitor'

The following objects are masked from 'package:stats':

    chisq.test, fisher.test
library(readxl)
library(lubridate)

Import

Same process as before: read the Excel file from data-raw/ and glimpse the result.

odessa <- read_excel("data-raw/OdessaCity.xlsx")

odessa |> glimpse()
Rows: 2,974
Columns: 8
$ date_arrest    <chr> "1/1/2018", "1/1/2018", "1/2/2018", "1/2/2018", "1/3/20…
$ name           <chr> "Prnce, Cameron", "TALLEY, JESSI MARIE", "Hendrix, Jess…
$ race           <chr> "B - Black or African American", "W - White", "W - Whit…
$ ethnicity      <chr> "N - Not Hispanic or Latino", "N - Not Hispanic or Lati…
$ sex            <chr> "M - Male", "F - Female", "F - Female", "M - Male", "M …
$ `Case No`      <chr> "18-0000022", "18-0000018", "18-0000091", "18-0000083",…
$ charges        <chr> "35620008 - POSS MARIJ 2OZ", "35620008 - POSS MARIJ 2OZ…
$ address_arrest <chr> "1900 Blk. E. 6th", "1st/Sam Houston", "Carolyn/ Andrew…

Clean

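Need to change date_arrest to a date type, strip the extra words from race and sex so only the one-letter code remains, and get rid of Case No altogether. Also add datetime_arrest, agency_arrest and age as empty (NA) columns.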

odessa_clean <- odessa |>
  mutate(
    date_arrest = mdy(date_arrest),  # parse "m/d/yyyy" strings into Date
    race = substr(race, 1, 1),       # keep only the one-letter code
    sex = substr(sex, 1, 1),
    # placeholder columns, all NA
    datetime_arrest = NA,
    agency_arrest = NA,
    age = NA
  ) |>
  select(-`Case No`)
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `date_arrest = mdy(date_arrest)`.
Caused by warning:
!  2 failed to parse.
odessa_clean |> glimpse()
Rows: 2,974
Columns: 10
$ date_arrest     <date> 2018-01-01, 2018-01-01, 2018-01-02, 2018-01-02, 2018-…
$ name            <chr> "Prnce, Cameron", "TALLEY, JESSI MARIE", "Hendrix, Jes…
$ race            <chr> "B", "W", "W", "W", "B", "W", "W", "W", "W", "W", "B",…
$ ethnicity       <chr> "N - Not Hispanic or Latino", "N - Not Hispanic or Lat…
$ sex             <chr> "M", "F", "F", "M", "M", "M", "F", "M", "M", "M", "F",…
$ charges         <chr> "35620008 - POSS MARIJ 2OZ", "35620008 - POSS MARIJ 2O…
$ address_arrest  <chr> "1900 Blk. E. 6th", "1st/Sam Houston", "Carolyn/ Andre…
$ datetime_arrest <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ agency_arrest   <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ age             <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
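
The mutate() warning above reports that two date_arrest values failed to parse, so those rows now hold NA. A minimal sketch of one way to find such rows: parse quietly into a separate column and keep the rows where the raw string is present but the parsed value is missing. The `raw` tibble, the `parsed` column and the `failed` object are hypothetical stand-ins, and the sample values are made up for illustration; they are not the two rows from the Odessa file.

```r
library(dplyr)
library(lubridate)

# Toy stand-in for the raw import; these values are invented for the example.
raw <- tibble::tibble(
  date_arrest = c("1/1/2018", "1/2/2018", "unknown", "n/a")
)

# Parse without emitting the warning, then keep rows where the raw value
# exists but mdy() could not turn it into a date.
failed <- raw |>
  mutate(parsed = mdy(date_arrest, quiet = TRUE)) |>
  filter(!is.na(date_arrest), is.na(parsed))

failed
```

Running the same filter against odessa (before date_arrest is overwritten) should show the original strings, which makes it easier to decide whether to fix them by hand or leave them as NA.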

Export

odessa_clean |> write_csv("data-processed/Odessa-City.csv")