Amarillo City

Setup

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)

Attaching package: 'janitor'

The following objects are masked from 'package:stats':

    chisq.test, fisher.test
library(readxl)
library(lubridate)

Import Excel Sheet

First, we will start with Amarillo City data to test out the process.

amarillo <- read_excel("data-raw/AmarilloCity.xlsx")

amarillo

It worked! read_excel() is the best!

Clean up columns

Now I want to clean up the columns so the date and time are separate and there are no extra columns. First, change the date/time column from text to a date-time type.

# Parse the month/day/year hour:minute text into a date-time
amarillo_date <- amarillo |>
  mutate(datetime_arrest = mdy_hm(datetime_arrest))

amarillo_date |> glimpse()
Rows: 649
Columns: 5
$ `Incident #s`   <chr> "20220521758", "20220521183", "20220521123", "20220521…
$ datetime_arrest <dttm> 2022-12-21 15:39:00, 2022-12-16 18:37:00, 2022-12-11 …
$ name            <chr> "GUEVARA, LAURA", "TOLER, NICHOLAS DON", "CARRILLO SAL…
$ charges         <chr> "POSS MARIJ < 2OZDRUG PARA-POSS CL C", "WARRANT-HOLD F…
$ address_arrest  <chr> "7200 SW 34TH AVE, AMAR", "SE 10TH AVE / ROSS ST, AMAR…

Done!
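As a quick check of what mdy_hm() does, here is a minimal sketch on a single made-up value. The raw string is an assumption (the actual spreadsheet text is not shown above); it is only meant to illustrate the month/day/year hour:minute pattern that mdy_hm() expects.

```r
library(lubridate)

# Hypothetical raw value, not taken from the spreadsheet
raw <- "12/21/2022 15:39"

# Month-day-year followed by hour:minute; the time zone defaults to UTC
parsed <- mdy_hm(raw)

class(parsed)   # POSIXct date-time
date(parsed)    # just the calendar date

# Values that do not match the pattern come back as NA
# (quiet = TRUE only suppresses the parsing warning)
bad <- mdy_hm("not a date", quiet = TRUE)
is.na(bad)
```

Counting the NA values after parsing, for example with sum(is.na(amarillo_date$datetime_arrest)), is one way to spot rows that did not parse.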

Now I want to get rid of the Incident #s column and make the date and time separate columns. This step also adds empty placeholder columns for age, race, sex, agency_arrest, and ethnicity.

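amarillo_clean <- amarillo_date |>
  # Pull the calendar date out of the date-time
  mutate(date_arrest = date(datetime_arrest)) |>
  # Drop the incident number column
  select(-`Incident #s`) |>
  # Add empty placeholder columns
  mutate(age = NA, race = NA, sex = NA, agency_arrest = NA, ethnicity = NA)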

amarillo_clean |> glimpse()
Rows: 649
Columns: 10
$ datetime_arrest <dttm> 2022-12-21 15:39:00, 2022-12-16 18:37:00, 2022-12-11 …
$ name            <chr> "GUEVARA, LAURA", "TOLER, NICHOLAS DON", "CARRILLO SAL…
$ charges         <chr> "POSS MARIJ < 2OZDRUG PARA-POSS CL C", "WARRANT-HOLD F…
$ address_arrest  <chr> "7200 SW 34TH AVE, AMAR", "SE 10TH AVE / ROSS ST, AMAR…
$ date_arrest     <date> 2022-12-21, 2022-12-16, 2022-12-11, 2022-12-10, 2022-…
$ age             <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ race            <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ sex             <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ agency_arrest   <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ ethnicity       <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…

The date now has its own column, but how do I get the time separated out too? One idea: pull out the separate pieces and paste them together.
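One possible approach, sketched on a small made-up table rather than the real data: format() can turn the time portion of a date-time into text, which avoids pasting the hour and minute together by hand. The names arrests, arrests_split, and time_arrest are hypothetical; only datetime_arrest and date_arrest come from the notebook above.

```r
library(dplyr)
library(tibble)
library(lubridate)

# Hypothetical stand-in for amarillo_date with two example timestamps
arrests <- tibble(
  datetime_arrest = ymd_hm(c("2022-12-21 15:39", "2022-12-16 18:37"))
)

arrests_split <- arrests |>
  mutate(
    # Calendar date only
    date_arrest = date(datetime_arrest),
    # Time of day as "HH:MM" text
    time_arrest = format(datetime_arrest, "%H:%M")
  )

arrests_split
```

Note that time_arrest is a character column here, which is fine for a CSV export but would need converting again for any time arithmetic.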

Export

amarillo_clean |> write_csv("data-processed/Amarillo-City.csv")
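
To confirm an export like this round-trips cleanly, here is a small sketch that writes a made-up table to a temporary file and reads it back. The demo table and the temp path are assumptions for illustration; the real export goes to data-processed/, which has to exist already because write_csv() does not create folders.

```r
library(readr)
library(tibble)

# Hypothetical one-row table with a date-time and an empty placeholder column
demo <- tibble(
  datetime_arrest = as.POSIXct("2022-12-21 15:39:00", tz = "UTC"),
  age = NA
)

# Write to a temporary file instead of the project folder
path <- tempfile(fileext = ".csv")
write_csv(demo, path)

# Look at the raw text that was written
read_lines(path)

# Read it back and check the column types
check <- read_csv(path, show_col_types = FALSE)
check
```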