This exercise provides an in-depth Geospatial Analysis of the cumulative COVID-19 confirmed cases and death rates in its sub-districts (keluruhan) using appropriate thematic and analytics mapping techniques and R functions.
This analysis aims to analyse the spatio-temporal patterns of monthly cumulative confirmed COVID-19 rate and death cases rate at keluruhan (sub-district) level of Daerah Khusus Ibukota Jakarta (DKI Jakarta). In other words, this analysis is done to detect which sub-districts had a relatively higher number of confirmed cases and death cases rates and how they changed over time. Additionally, the temporal interval will be at the last day of the month while the geographic will be at the sub-district.
The data set used for this analysis includes:
Aspatial: Daily updates of COVID-19 statistics in DKI Jakarta at https://riwayat-file-covid-19-dki-jakarta-jakartagis.hub.arcgis.com. The cumulative data is available in the “data” worksheet.
Geospatial: Shapefile (SHP) Batas Desa Provinsi DKI Jakarta provided at
PODES 2019. There are other geospatial data in ESRI shapefile format at different geographical levels available at Indonesia Geospatial. But for the purpose of this analysis, only SHP Batas Desa Provinsi DKI Jakarta will be used.
Before we proceed with the analysis, there were some issues found when downloading the aspatial dataset which we need to take note of:
Hence, the remedy is to:
This code chunk performs 3 tasks:
packages <- c('sf', 'tidyverse', 'tmap', 'kableExtra', 'ggpubr', 'devtools', 'gifski')
for(p in packages){
if(!require(p, character.only = T)){
install.packages(p)
}
library(p, character.only = T)
}
devtools::install_github("gadenbuie/xaringanExtra")
library(xaringanExtra)
More on the packages used:
dkij_sf <- st_read("data/geospatial", layer="BATAS_DESA_DESEMBER_2019_DUKCAPIL_DKI_JAKARTA")
Reading layer `BATAS_DESA_DESEMBER_2019_DUKCAPIL_DKI_JAKARTA' from data source `C:\aisyahajit2018\IS415\IS415_blog\_posts\2021-09-10-take-home-exercise-1\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 269 features and 161 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 106.3831 ymin: -6.370815 xmax: 106.9728 ymax: -5.184322
Geodetic CRS: WGS 84
glimpse(dkij_sf)
Rows: 269
Columns: 162
$ OBJECT_ID <dbl> 25477, 25478, 25397, 25400, 25378, 25379, 25390, ~
$ KODE_DESA <chr> "3173031006", "3173031007", "3171031003", "317103~
$ DESA <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMPAKA ~
$ KODE <dbl> 317303, 317303, 317103, 317103, 310101, 310101, 3~
$ PROVINSI <chr> "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", "DKI~
$ KAB_KOTA <chr> "JAKARTA BARAT", "JAKARTA BARAT", "JAKARTA PUSAT"~
$ KECAMATAN <chr> "TAMAN SARI", "TAMAN SARI", "KEMAYORAN", "KEMAYOR~
$ DESA_KELUR <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMPAKA ~
$ JUMLAH_PEN <dbl> 21609, 9069, 29085, 41913, 6947, 7059, 15793, 589~
$ JUMLAH_KK <dbl> 7255, 3273, 9217, 13766, 2026, 2056, 5599, 1658, ~
$ LUAS_WILAY <dbl> 0.36, 0.37, 0.53, 0.97, 0.93, 0.95, 1.76, 1.14, 0~
$ KEPADATAN <dbl> 60504, 24527, 54465, 42993, 7497, 7401, 8971, 515~
$ PERPINDAHA <dbl> 102, 25, 131, 170, 17, 26, 58, 13, 113, 178, 13, ~
$ JUMLAH_MEN <dbl> 68, 52, 104, 151, 14, 32, 36, 10, 60, 92, 5, 83, ~
$ PERUBAHAN <dbl> 20464, 8724, 27497, 38323, 6853, 6993, 15006, 580~
$ WAJIB_KTP <dbl> 16027, 7375, 20926, 30264, 4775, 4812, 12559, 398~
$ SILAM <dbl> 15735, 1842, 26328, 36813, 6941, 7057, 7401, 5891~
$ KRISTEN <dbl> 2042, 2041, 1710, 3392, 6, 0, 3696, 0, 4058, 5130~
$ KHATOLIK <dbl> 927, 1460, 531, 1082, 0, 0, 1602, 0, 2100, 2575, ~
$ HINDU <dbl> 15, 9, 42, 127, 0, 0, 622, 0, 25, 27, 0, 9, 115, ~
$ BUDHA <dbl> 2888, 3716, 469, 495, 0, 2, 2462, 0, 4134, 4740, ~
$ KONGHUCU <dbl> 2, 1, 5, 1, 0, 0, 10, 0, 9, 10, 0, 4, 1, 1, 4, 0,~
$ KEPERCAYAA <dbl> 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 2, 0, 22, 0, 3, ~
$ PRIA <dbl> 11049, 4404, 14696, 21063, 3547, 3551, 7833, 2954~
$ WANITA <dbl> 10560, 4665, 14389, 20850, 3400, 3508, 7960, 2937~
$ BELUM_KAWI <dbl> 10193, 4240, 14022, 20336, 3366, 3334, 7578, 2836~
$ KAWIN <dbl> 10652, 4364, 13450, 19487, 3224, 3404, 7321, 2791~
$ CERAI_HIDU <dbl> 255, 136, 430, 523, 101, 80, 217, 44, 381, 476, 3~
$ CERAI_MATI <dbl> 509, 329, 1183, 1567, 256, 241, 677, 220, 1197, 9~
$ U0 <dbl> 1572, 438, 2232, 3092, 640, 648, 802, 585, 2220, ~
$ U5 <dbl> 1751, 545, 2515, 3657, 645, 684, 995, 588, 2687, ~
$ U10 <dbl> 1703, 524, 2461, 3501, 620, 630, 1016, 513, 2653,~
$ U15 <dbl> 1493, 521, 2318, 3486, 669, 671, 1106, 548, 2549,~
$ U20 <dbl> 1542, 543, 2113, 3098, 619, 609, 1081, 491, 2313,~
$ U25 <dbl> 1665, 628, 2170, 3024, 639, 582, 1002, 523, 2446,~
$ U30 <dbl> 1819, 691, 2363, 3188, 564, 592, 1236, 478, 2735,~
$ U35 <dbl> 1932, 782, 2595, 3662, 590, 572, 1422, 504, 3034,~
$ U40 <dbl> 1828, 675, 2371, 3507, 480, 486, 1200, 397, 2689,~
$ U45 <dbl> 1600, 607, 2250, 3391, 421, 457, 1163, 365, 2470,~
$ U50 <dbl> 1408, 619, 1779, 2696, 346, 369, 1099, 288, 2129,~
$ U55 <dbl> 1146, 602, 1379, 1909, 252, 318, 979, 235, 1843, ~
$ U60 <dbl> 836, 614, 1054, 1397, 197, 211, 880, 162, 1386, 1~
$ U65 <dbl> 587, 555, 654, 970, 122, 114, 747, 111, 958, 932,~
$ U70 <dbl> 312, 311, 411, 631, 69, 55, 488, 65, 554, 573, 38~
$ U75 <dbl> 415, 414, 420, 704, 74, 61, 577, 38, 717, 642, 37~
$ TIDAK_BELU <dbl> 3426, 1200, 4935, 7328, 1306, 1318, 2121, 973, 50~
$ BELUM_TAMA <dbl> 1964, 481, 2610, 3763, 730, 676, 1278, 732, 3241,~
$ TAMAT_SD <dbl> 2265, 655, 2346, 2950, 1518, 2054, 1169, 1266, 44~
$ SLTP <dbl> 3660, 1414, 3167, 5138, 906, 1357, 2236, 852, 585~
$ SLTA <dbl> 8463, 3734, 12172, 16320, 2040, 1380, 5993, 1570,~
$ DIPLOMA_I <dbl> 81, 23, 84, 179, 22, 15, 43, 36, 85, 83, 4, 63, 2~
$ DIPLOMA_II <dbl> 428, 273, 1121, 1718, 101, 59, 573, 97, 604, 740,~
$ DIPLOMA_IV <dbl> 1244, 1241, 2477, 4181, 314, 191, 2199, 357, 1582~
$ STRATA_II <dbl> 74, 46, 166, 315, 10, 8, 168, 8, 63, 92, 5, 174, ~
$ STRATA_III <dbl> 4, 2, 7, 21, 0, 1, 13, 0, 3, 9, 0, 16, 8, 7, 75, ~
$ BELUM_TIDA <dbl> 3927, 1388, 5335, 8105, 1788, 1627, 2676, 1129, 5~
$ APARATUR_P <dbl> 81, 10, 513, 931, 246, 75, 156, 160, 132, 79, 23,~
$ TENAGA_PEN <dbl> 70, 43, 288, 402, 130, 93, 81, 123, 123, 73, 45, ~
$ WIRASWASTA <dbl> 8974, 3832, 10662, 14925, 788, 728, 6145, 819, 12~
$ PERTANIAN <dbl> 1, 0, 1, 3, 2, 2, 1, 3, 2, 5, 1, 1, 0, 0, 2, 5, 2~
$ NELAYAN <dbl> 0, 0, 2, 0, 960, 1126, 1, 761, 1, 2, 673, 0, 0, 0~
$ AGAMA_DAN <dbl> 6, 6, 5, 40, 0, 0, 49, 2, 10, 11, 0, 54, 15, 16, ~
$ PELAJAR_MA <dbl> 4018, 1701, 6214, 9068, 1342, 1576, 3135, 1501, 6~
$ TENAGA_KES <dbl> 28, 29, 80, 142, 34, 26, 60, 11, 48, 55, 16, 68, ~
$ PENSIUNAN <dbl> 57, 50, 276, 498, 20, 7, 59, 14, 56, 75, 2, 97, 5~
$ LAINNYA <dbl> 4447, 2010, 5709, 7799, 1637, 1799, 3430, 1368, 7~
$ GENERATED <chr> "30 Juni 2019", "30 Juni 2019", "30 Juni 2019", "~
$ KODE_DES_1 <chr> "3173031006", "3173031007", "3171031003", "317103~
$ BELUM_ <dbl> 3099, 1032, 4830, 7355, 1663, 1704, 2390, 1213, 5~
$ MENGUR_ <dbl> 4447, 2026, 5692, 7692, 1576, 1731, 3500, 1323, 7~
$ PELAJAR_ <dbl> 3254, 1506, 6429, 8957, 1476, 1469, 3185, 1223, 6~
$ PENSIUNA_1 <dbl> 80, 65, 322, 603, 24, 8, 70, 20, 75, 97, 2, 132, ~
$ PEGAWAI_ <dbl> 48, 5, 366, 612, 223, 72, 65, 143, 73, 48, 15, 89~
$ TENTARA <dbl> 4, 0, 41, 57, 3, 0, 74, 1, 20, 12, 2, 11, 90, 340~
$ KEPOLISIAN <dbl> 10, 1, 16, 42, 11, 8, 2, 9, 17, 7, 3, 9, 165, 15,~
$ PERDAG_ <dbl> 31, 5, 1, 3, 6, 1, 2, 4, 3, 1, 4, 0, 1, 2, 9, 2, ~
$ PETANI <dbl> 0, 0, 1, 2, 0, 1, 1, 0, 1, 1, 1, 2, 0, 0, 1, 2, 0~
$ PETERN_ <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ NELAYAN_1 <dbl> 1, 0, 1, 0, 914, 1071, 0, 794, 0, 1, 663, 0, 0, 0~
$ INDUSTR_ <dbl> 7, 3, 4, 3, 1, 3, 0, 0, 1, 7, 0, 0, 2, 2, 1, 3, 1~
$ KONSTR_ <dbl> 3, 0, 2, 6, 3, 8, 1, 6, 1, 5, 10, 0, 2, 5, 7, 4, ~
$ TRANSP_ <dbl> 2, 0, 7, 4, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 6, 3, 2~
$ KARYAW_ <dbl> 6735, 3034, 7347, 10185, 237, 264, 4319, 184, 940~
$ KARYAW1 <dbl> 9, 2, 74, 231, 4, 0, 16, 1, 13, 10, 1, 24, 17, 29~
$ KARYAW1_1 <dbl> 0, 0, 5, 15, 0, 0, 0, 1, 0, 1, 0, 0, 2, 4, 7, 9, ~
$ KARYAW1_12 <dbl> 23, 4, 25, 35, 141, 50, 16, 157, 6, 9, 40, 11, 11~
$ BURUH <dbl> 515, 155, 971, 636, 63, 218, 265, 55, 1085, 652, ~
$ BURUH_ <dbl> 1, 0, 0, 0, 2, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 2, 1~
$ BURUH1 <dbl> 0, 0, 1, 0, 1, 25, 0, 2, 0, 1, 1, 0, 0, 0, 0, 0, ~
$ BURUH1_1 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ PEMBANT_ <dbl> 1, 1, 4, 1, 1, 0, 7, 0, 5, 1, 0, 6, 1, 10, 11, 9,~
$ TUKANG <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0~
$ TUKANG_1 <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ TUKANG_12 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ TUKANG__13 <dbl> 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0~
$ TUKANG__14 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ TUKANG__15 <dbl> 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0~
$ TUKANG__16 <dbl> 7, 4, 10, 14, 0, 0, 2, 0, 7, 8, 0, 8, 1, 0, 3, 2,~
$ TUKANG__17 <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0~
$ PENATA <dbl> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0~
$ PENATA_ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ PENATA1_1 <dbl> 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 2, 0, 1, 0, 0, 0~
$ MEKANIK <dbl> 11, 1, 10, 8, 0, 0, 4, 0, 7, 8, 0, 9, 0, 15, 10, ~
$ SENIMAN_ <dbl> 4, 0, 12, 28, 0, 0, 2, 0, 3, 4, 0, 9, 6, 7, 14, 1~
$ TABIB <dbl> 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0~
$ PARAJI_ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ PERANCA_ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 2, 1, 0, 1~
$ PENTER_ <dbl> 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 2~
$ IMAM_M <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ PENDETA <dbl> 2, 4, 5, 33, 0, 0, 20, 0, 10, 8, 0, 30, 14, 14, 1~
$ PASTOR <dbl> 0, 1, 0, 1, 0, 0, 8, 0, 0, 0, 0, 23, 0, 0, 0, 0, ~
$ WARTAWAN <dbl> 7, 1, 16, 27, 0, 0, 4, 0, 8, 6, 0, 9, 5, 9, 26, 3~
$ USTADZ <dbl> 6, 1, 1, 5, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0~
$ JURU_M <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0~
$ PROMOT <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ ANGGOTA_ <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 2, 1, 2~
$ ANGGOTA1 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0~
$ ANGGOTA1_1 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ PRESIDEN <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ WAKIL_PRES <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1~
$ ANGGOTA1_2 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0~
$ ANGGOTA1_3 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0~
$ DUTA_B <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1~
$ GUBERNUR <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1~
$ WAKIL_GUBE <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ BUPATI <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ WAKIL_BUPA <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ WALIKOTA <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ WAKIL_WALI <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ ANGGOTA1_4 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 1~
$ ANGGOTA1_5 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ DOSEN <dbl> 3, 2, 23, 36, 1, 2, 11, 0, 3, 5, 0, 14, 6, 28, 69~
$ GURU <dbl> 72, 40, 272, 378, 118, 72, 69, 116, 126, 71, 36, ~
$ PILOT <dbl> 1, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 1~
$ PENGACARA_ <dbl> 4, 1, 8, 22, 0, 0, 5, 0, 5, 4, 0, 4, 3, 12, 24, 2~
$ NOTARIS <dbl> 0, 0, 3, 5, 0, 0, 4, 0, 0, 0, 0, 5, 0, 5, 10, 3, ~
$ ARSITEK <dbl> 1, 0, 2, 3, 0, 0, 2, 0, 0, 0, 0, 4, 1, 2, 7, 3, 9~
$ AKUNTA_ <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 2~
$ KONSUL_ <dbl> 1, 0, 2, 11, 0, 0, 4, 0, 0, 0, 0, 6, 2, 3, 10, 8,~
$ DOKTER <dbl> 16, 32, 35, 68, 0, 1, 63, 0, 27, 32, 1, 63, 48, 6~
$ BIDAN <dbl> 3, 1, 9, 18, 12, 8, 1, 3, 3, 3, 7, 3, 10, 10, 7, ~
$ PERAWAT <dbl> 7, 0, 25, 44, 12, 10, 3, 6, 12, 20, 6, 7, 26, 16,~
$ APOTEK_ <dbl> 0, 0, 2, 3, 1, 0, 0, 0, 1, 2, 0, 1, 2, 3, 3, 3, 3~
$ PSIKIATER <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, 2, 1~
$ PENYIA_ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0~
$ PENYIA1 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ PELAUT <dbl> 0, 0, 6, 16, 1, 1, 0, 14, 2, 4, 1, 2, 4, 2, 10, 1~
$ PENELITI <dbl> 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 3, 0, 1~
$ SOPIR <dbl> 65, 3, 94, 123, 0, 1, 61, 0, 76, 79, 0, 63, 44, 1~
$ PIALAN <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ PARANORMAL <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0~
$ PEDAGA_ <dbl> 379, 126, 321, 562, 11, 10, 412, 15, 202, 225, 0,~
$ PERANG_ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ KEPALA_ <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
$ BIARAW_ <dbl> 0, 1, 0, 0, 0, 0, 22, 0, 3, 0, 0, 2, 1, 0, 4, 0, ~
$ WIRASWAST_ <dbl> 1370, 611, 1723, 3099, 131, 119, 1128, 259, 2321,~
$ LAINNYA_12 <dbl> 94, 57, 82, 122, 12, 10, 41, 6, 89, 158, 24, 37, ~
$ LUAS_DESA <dbl> 25476, 25477, 25396, 25399, 25377, 25378, 25389, ~
$ KODE_DES_3 <chr> "3173031006", "3173031007", "3171031003", "317103~
$ DESA_KEL_1 <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMPAKA ~
$ KODE_12 <dbl> 317303, 317303, 317103, 317103, 310101, 310101, 3~
$ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((106.8164 -6..., MULT~
st_crs(dkij_sf)
Coordinate Reference System:
User input: WGS 84
wkt:
GEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["latitude",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["longitude",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]]
dkij_sf23845<- st_transform(dkij_sf, crs = 23845)
st_crs(dkij_sf23845)
Coordinate Reference System:
User input: EPSG:23845
wkt:
PROJCRS["DGN95 / Indonesia TM-3 zone 54.1",
BASEGEOGCRS["DGN95",
DATUM["Datum Geodesi Nasional 1995",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4755]],
CONVERSION["Indonesia TM-3 zone 54.1",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",139.5,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9999,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",200000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",1500000,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["easting (X)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["northing (Y)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre."],
AREA["Indonesia - onshore east of 138°E."],
BBOX[-9.19,138,-1.49,141.01]],
ID["EPSG",23845]]
plot(dkij_sf23845)
tmap_mode('view')
tm_shape(dkij_sf23845)+
tm_polygons("KAB_KOTA")
Here we are switching back to plot mode since each interactive mode will consume a connection:
tmap_mode('plot')
From the interactive map above, we can see that:
dkij_5cities <- filter(dkij_sf23845, KAB_KOTA != "KEPULAUAN SERIBU")
dkij_slice<- dkij_5cities[1:9]
head(dkij_slice)
Simple feature collection with 6 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -3627698 ymin: 689029.1 xmax: -3619528 ymax: 693390.8
Projected CRS: DGN95 / Indonesia TM-3 zone 54.1
OBJECT_ID KODE_DESA DESA KODE PROVINSI KAB_KOTA
1 25477 3173031006 KEAGUNGAN 317303 DKI JAKARTA JAKARTA BARAT
2 25478 3173031007 GLODOK 317303 DKI JAKARTA JAKARTA BARAT
3 25397 3171031003 HARAPAN MULIA 317103 DKI JAKARTA JAKARTA PUSAT
4 25400 3171031006 CEMPAKA BARU 317103 DKI JAKARTA JAKARTA PUSAT
5 25390 3171021001 PASAR BARU 317102 DKI JAKARTA JAKARTA PUSAT
6 25391 3171021002 KARANG ANYAR 317102 DKI JAKARTA JAKARTA PUSAT
KECAMATAN DESA_KELUR JUMLAH_PEN geometry
1 TAMAN SARI KEAGUNGAN 21609 MULTIPOLYGON (((-3626874 69...
2 TAMAN SARI GLODOK 9069 MULTIPOLYGON (((-3627130 69...
3 KEMAYORAN HARAPAN MULIA 29085 MULTIPOLYGON (((-3621251 68...
4 KEMAYORAN CEMPAKA BARU 41913 MULTIPOLYGON (((-3620608 69...
5 SAWAH BESAR PASAR BARU 15793 MULTIPOLYGON (((-3624097 69...
6 SAWAH BESAR KARANG ANYAR 33383 MULTIPOLYGON (((-3624785 69...
From the results above, we can see that the remaining fields after slicing are as follows:
During the first try of reading the excel files, there were some data issues encountered:
Hence, when reading the excel files, we should aim to resolve these data issues first before successfully integrating the monthly data into 1 dataframe.
The following code chunk performs 6 tasks:
Read the “data” sheet of all the COVID excel files in the aspatial folder
Remove duplicated columns except the last instance
Convert tibbles to dataframe
Extract the 7 required columns from COVID aspatial data:
1. ID_KEL,
2. Nama_provinsi,
3. nama_kota,
4. nama_kecamatan
5. nama_kelurahan,
6. POSITIF (cumulative confirmed cases),
7. meninggal (cumulative death cases) from data worksheet of the daily COVID-19 data
Extract and create Month and Year columns derived from file name
Drop rows where values in ID_KEL column are not digits
aps_path = "data/aspatial"
filenames <- list.files(path = aps_path, pattern = "*.xlsx", full.names = T)
aspatial <- lapply(filenames, function(i){
data <- which(readxl::excel_sheets(i) == "data")
xl <- readxl::read_xlsx(i, sheet=data, .name_repair = "minimal")
non_dup <- xl[, !duplicated(colnames(xl), fromLast = TRUE)]
df <-data.frame(non_dup)
req_col_df <- select(df, `ID_KEL`, `Nama_provinsi`, `nama_kota`, `nama_kecamatan`, `nama_kelurahan`, `POSITIF`, `Meninggal`)
full_date <- gsub(".*\\([0-9]{2} (.*)\\).*", "\\1", i)
month <- gsub('(^\\w+)\\s.+','\\1',full_date)
year <- gsub('^.*([0-9]{4}).*','\\1', full_date)
req_col_df$Month = toupper(month)
req_col_df$Year = strtoi(year)
req_col_df$MthYr <- paste(req_col_df$Month, req_col_df$Year, sep="_")
req_col_df <- req_col_df %>% filter(grepl("[[:digit:]]", ID_KEL))
return(req_col_df)
})
covid_df <- do.call("rbind.data.frame", aspatial)
Before proceeding with our analysis, it is important to save our aspatial data as an RDS file so that we can avoid:
covid_df_rds <- write_rds(covid_df, "data/aspatial/rds/covid_df.rds")
covid_df <- read_rds("data/aspatial/rds/covid_df.rds")
glimpse(covid_df)
Rows: 4,539
Columns: 10
$ ID_KEL <chr> "3172051003", "3173041007", "3175041005", "31~
$ Nama_provinsi <chr> "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", ~
$ nama_kota <chr> "JAKARTA UTARA", "JAKARTA BARAT", "JAKARTA TI~
$ nama_kecamatan <chr> "PADEMANGAN", "TAMBORA", "KRAMAT JATI", "JATI~
$ nama_kelurahan <chr> "ANCOL", "ANGKE", "BALE KAMBANG", "BALI MESTE~
$ POSITIF <dbl> 834, 617, 755, 358, 870, 811, 1038, 1447, 898~
$ Meninggal <dbl> 9, 8, 15, 8, 13, 10, 10, 25, 12, 22, 30, 16, ~
$ Month <chr> "FEBRUARI", "FEBRUARI", "FEBRUARI", "FEBRUARI~
$ Year <int> 2021, 2021, 2021, 2021, 2021, 2021, 2021, 202~
$ MthYr <chr> "FEBRUARI_2021", "FEBRUARI_2021", "FEBRUARI_2~
We can see that there are a total of 4,539 rows in the COVID aspatial data.
dkij_slice will be used as the main table since the outer islands have already been excluded from it.
Although there are multiple fields that can be mapped like - KECAMATAN->nama_kecamatan, - DESA_KELUR->nama_kelurahan,
We will only be using KODE_DESA from dkij_slice and ID_KEL from covid_df to join since there are some discrepancies found with the other fields. Below are some examples of the discrepancies found:
The following code chunk helps us to see how we can observe these discrepancies in the columns:
KECAMATAN[!(KECAMATAN %in% nama_kecamatan)]
[1] "KALIDERES" "PAL MERAH" "SETIABUDI" "PULOGADUNG" "KRAMATJATI"
nama_kecamatan[!(nama_kecamatan %in% KECAMATAN)]
[1] "KRAMAT JATI" "PULO GADUNG" "SETIA BUDI"
[4] "PALMERAH" "KALI DERES" "KEP. SERIBU UTARA"
[7] "KEP. SERIBU SELATAN"
DESA_KELUR[!(DESA_KELUR %in% nama_kelurahan)]
[1] "KALIBARU" "KRENDANG"
[3] "RAWAJATI" "TENGAH"
[5] "BALEKAMBANG" "PINANGRANTI"
[7] "JATIPULO" "PALMERIAM"
[9] "KRAMATJATI" "HALIM PERDANA KUSUMA"
nama_kelurahan[!(nama_kelurahan %in% DESA_KELUR)]
[1] "BALE KAMBANG" "HALIM PERDANA KUSUMAH"
[3] "JATI PULO" "KALI BARU"
[5] "KAMPUNG TENGAH" "KERENDANG"
[7] "KRAMAT JATI" "PAL MERIAM"
[9] "PINANG RANTI" "PULAU HARAPAN"
[11] "PULAU KELAPA" "PULAU PANGGANG"
[13] "PULAU PARI" "PULAU TIDUNG"
[15] "PULAU UNTUNG JAWA" "RAWA JATI"
[17] "P. HARAPAN" "P. KELAPA"
[19] "P. PANGGANG" "P. PARI"
[21] "P. TIDUNG" "UNTUNG JAWA"
dki_cov <- left_join(dkij_slice, covid_df, by = c('KODE_DESA' ='ID_KEL'))
kable(head(dki_cov)) %>%
kable_styling("striped", full_width = F) %>%
scroll_box(width = "100%", height = "400px")
OBJECT_ID | KODE_DESA | DESA | KODE | PROVINSI | KAB_KOTA | KECAMATAN | DESA_KELUR | JUMLAH_PEN | Nama_provinsi | nama_kota | nama_kecamatan | nama_kelurahan | POSITIF | Meninggal | Month | Year | MthYr | geometry |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
25477 | 3173031006 | KEAGUNGAN | 317303 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 21609 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 435 | 3 | FEBRUARI | 2021 | FEBRUARI_2021 | MULTIPOLYGON (((-3626874 69… |
25477 | 3173031006 | KEAGUNGAN | 317303 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 21609 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 3 | 0 | APRIL | 2020 | APRIL_2020 | MULTIPOLYGON (((-3626874 69… |
25477 | 3173031006 | KEAGUNGAN | 317303 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 21609 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 540 | 4 | APRIL | 2021 | APRIL_2021 | MULTIPOLYGON (((-3626874 69… |
25477 | 3173031006 | KEAGUNGAN | 317303 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 21609 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 349 | 2 | JANUARI | 2021 | JANUARI_2021 | MULTIPOLYGON (((-3626874 69… |
25477 | 3173031006 | KEAGUNGAN | 317303 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 21609 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 6 | 0 | JUNI | 2020 | JUNI_2020 | MULTIPOLYGON (((-3626874 69… |
25477 | 3173031006 | KEAGUNGAN | 317303 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 21609 | DKI JAKARTA | JAKARTA BARAT | TAMAN SARI | KEAGUNGAN | 725 | 7 | JUNI | 2021 | JUNI_2021 | MULTIPOLYGON (((-3626874 69… |
The formula that will be used in this calculation is as such:
OBJECT_ID KODE_DESA DESA KODE
0 0 0 0
PROVINSI KAB_KOTA KECAMATAN DESA_KELUR
0 0 0 0
JUMLAH_PEN Nama_provinsi nama_kota nama_kecamatan
0 0 0 0
nama_kelurahan POSITIF Meninggal Month
0 0 0 0
Year MthYr geometry
0 0 0
dki_cov_wogeo <- st_set_geometry(dki_cov, NULL)
glimpse(dki_cov_wogeo)
Rows: 4,437
Columns: 18
$ OBJECT_ID <dbl> 25477, 25477, 25477, 25477, 25477, 25477, 254~
$ KODE_DESA <chr> "3173031006", "3173031006", "3173031006", "31~
$ DESA <chr> "KEAGUNGAN", "KEAGUNGAN", "KEAGUNGAN", "KEAGU~
$ KODE <dbl> 317303, 317303, 317303, 317303, 317303, 31730~
$ PROVINSI <chr> "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", ~
$ KAB_KOTA <chr> "JAKARTA BARAT", "JAKARTA BARAT", "JAKARTA BA~
$ KECAMATAN <chr> "TAMAN SARI", "TAMAN SARI", "TAMAN SARI", "TA~
$ DESA_KELUR <chr> "KEAGUNGAN", "KEAGUNGAN", "KEAGUNGAN", "KEAGU~
$ JUMLAH_PEN <dbl> 21609, 21609, 21609, 21609, 21609, 21609, 216~
$ Nama_provinsi <chr> "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", ~
$ nama_kota <chr> "JAKARTA BARAT", "JAKARTA BARAT", "JAKARTA BA~
$ nama_kecamatan <chr> "TAMAN SARI", "TAMAN SARI", "TAMAN SARI", "TA~
$ nama_kelurahan <chr> "KEAGUNGAN", "KEAGUNGAN", "KEAGUNGAN", "KEAGU~
$ POSITIF <dbl> 435, 3, 540, 349, 6, 725, 221, 144, 60, 260, ~
$ Meninggal <dbl> 3, 0, 4, 2, 0, 7, 2, 1, 1, 2, 0, 11, 0, 3, 0,~
$ Month <chr> "FEBRUARI", "APRIL", "APRIL", "JANUARI", "JUN~
$ Year <int> 2021, 2020, 2021, 2021, 2020, 2021, 2020, 202~
$ MthYr <chr> "FEBRUARI_2021", "APRIL_2020", "APRIL_2021", ~
cumu_cc_rate <- dki_cov_wogeo %>%
group_by(KODE_DESA, nama_kelurahan, MthYr, JUMLAH_PEN) %>%
summarise(`TOTAL_CONFIRMED` = sum(`POSITIF`)) %>%
ungroup() %>%
mutate(`cumu_cc_rate` = `TOTAL_CONFIRMED`/`JUMLAH_PEN`*10000) %>%
select(`KODE_DESA`, `nama_kelurahan`, `JUMLAH_PEN`, `MthYr`, `cumu_cc_rate`) %>%
pivot_wider(names_from=MthYr,
values_from=cumu_cc_rate)
glimpse(cumu_cc_rate)
Rows: 261
Columns: 20
$ KODE_DESA <chr> "3171011001", "3171011002", "3171011003", "31~
$ nama_kelurahan <chr> "GAMBIR", "CIDENG", "PETOJO UTARA", "PETOJO S~
$ JUMLAH_PEN <dbl> 3088, 19216, 21828, 18388, 12858, 25256, 1579~
$ AGUSTUS_2020 <dbl> 210.49223, 43.19317, 28.40388, 47.85730, 22.5~
$ APRIL_2020 <dbl> 6.4766839, 5.2039967, 5.0393989, 1.6314988, 5~
$ APRIL_2021 <dbl> 2014.2487, 387.6978, 259.7581, 414.9445, 323.~
$ DESEMBER_2020 <dbl> 1036.2694, 179.0175, 114.9899, 246.9002, 159.~
$ FEBRUARI_2021 <dbl> 1632.1244, 335.6578, 227.2311, 365.4557, 259.~
$ JANUARI_2021 <dbl> 1318.0052, 261.2406, 175.0046, 300.7396, 221.~
$ JULI_2020 <dbl> 35.621762, 20.815987, 19.241341, 16.858821, 8~
$ JULI_2021 <dbl> 3808.2902, 879.9958, 637.7130, 863.0629, 726.~
$ JUNI_2020 <dbl> 29.145078, 10.928393, 13.285688, 10.876659, 6~
$ JUNI_2021 <dbl> 2726.6839, 545.8993, 380.2456, 582.4451, 496.~
$ MARET_2020 <dbl> 0.0000000, 0.5203997, 0.9162544, 0.0000000, 0~
$ MARET_2021 <dbl> 1839.3782, 370.5246, 246.9305, 388.2967, 286.~
$ MEI_2020 <dbl> 3.238342, 8.326395, 8.246289, 2.175332, 5.444~
$ MEI_2021 <dbl> 2075.7772, 409.0341, 272.5857, 450.2937, 356.~
$ NOVEMBER_2020 <dbl> 783.67876, 130.62032, 87.50229, 186.53470, 12~
$ OKTOBER_2020 <dbl> 605.56995, 101.47794, 75.13286, 155.53622, 76~
$ SEPTEMBER_2020 <dbl> 511.65803, 82.74355, 56.34964, 126.16924, 56.~
cumu_death_rate <- dki_cov_wogeo %>%
group_by(KODE_DESA, nama_kelurahan, MthYr, JUMLAH_PEN) %>%
summarise(`TOTAL_DEATHS` = sum(`Meninggal`)) %>%
ungroup() %>%
mutate(`cumu_death_rate` = `TOTAL_DEATHS`/`JUMLAH_PEN`*10000) %>%
select(`KODE_DESA`, `nama_kelurahan`, `JUMLAH_PEN`, `MthYr`, `cumu_death_rate`) %>%
pivot_wider(names_from=MthYr,
values_from=cumu_death_rate)
glimpse(cumu_death_rate)
Rows: 261
Columns: 20
$ KODE_DESA <chr> "3171011001", "3171011002", "3171011003", "31~
$ nama_kelurahan <chr> "GAMBIR", "CIDENG", "PETOJO UTARA", "PETOJO S~
$ JUMLAH_PEN <dbl> 3088, 19216, 21828, 18388, 12858, 25256, 1579~
$ AGUSTUS_2020 <dbl> 0.0000000, 2.6019983, 0.9162544, 3.2629976, 0~
$ APRIL_2020 <dbl> 0.0000000, 1.0407993, 0.9162544, 0.5438329, 0~
$ APRIL_2021 <dbl> 25.906736, 6.244796, 3.665017, 7.069828, 5.44~
$ DESEMBER_2020 <dbl> 6.476684, 4.163197, 1.374382, 4.894496, 1.555~
$ FEBRUARI_2021 <dbl> 19.430052, 5.203997, 2.290636, 5.982162, 3.88~
$ JANUARI_2021 <dbl> 16.191710, 5.203997, 1.374382, 4.894496, 3.11~
$ JULI_2020 <dbl> 0.0000000, 1.5611990, 0.9162544, 1.0876659, 0~
$ JULI_2021 <dbl> 42.098446, 13.530391, 9.162544, 12.508157, 6.~
$ JUNI_2020 <dbl> 0.0000000, 1.5611990, 0.9162544, 0.5438329, 0~
$ JUNI_2021 <dbl> 29.145078, 7.285595, 5.497526, 9.245160, 6.99~
$ MARET_2020 <dbl> 0.0000000, 0.0000000, 0.4581272, 0.0000000, 0~
$ MARET_2021 <dbl> 25.906736, 5.724396, 3.206890, 7.069828, 4.66~
$ MEI_2020 <dbl> 0.0000000, 1.0407993, 0.9162544, 0.5438329, 0~
$ MEI_2021 <dbl> 29.145078, 6.765196, 3.665017, 8.157494, 6.22~
$ NOVEMBER_2020 <dbl> 3.2383420, 4.1631973, 1.3743815, 4.8944964, 0~
$ OKTOBER_2020 <dbl> 3.2383420, 3.6427977, 1.3743815, 4.3506635, 0~
$ SEPTEMBER_2020 <dbl> 0.0000000, 3.1223980, 1.3743815, 4.3506635, 0~
Relative risk also known as standardized mortality rate (SMR) is sometimes used to compare the observed mortality rate to a national (or regional) standard. This means that we would want to compare the death rates of the individual sub districts to the national DKI death rates.
rel_risk <- dki_cov_wogeo %>%
mutate(`TOTAL_DEATHS` = sum(`Meninggal`)) %>%
mutate(`REL_RISK` = (`TOTAL_DEATHS`*100)/ `TOTAL_DEATHS`/`JUMLAH_PEN`) %>%
select (`KODE_DESA`, `REL_RISK`)
glimpse(rel_risk)
Rows: 4,437
Columns: 2
$ KODE_DESA <chr> "3173031006", "3173031006", "3173031006", "3173031~
$ REL_RISK <dbl> 0.004627701, 0.004627701, 0.004627701, 0.004627701~
total_deaths <- dki_cov_wogeo %>%
group_by(KODE_DESA, nama_kelurahan, JUMLAH_PEN) %>%
summarise(`TOTAL_DEATHS` = sum(`Meninggal`)) %>%
ungroup() %>%
select (`KODE_DESA`, `TOTAL_DEATHS`, `JUMLAH_PEN`, nama_kelurahan)
glimpse(total_deaths)
Rows: 261
Columns: 4
$ KODE_DESA <chr> "3171011001", "3171011002", "3171011003", "31~
$ TOTAL_DEATHS <dbl> 62, 140, 86, 146, 58, 144, 96, 200, 153, 187,~
$ JUMLAH_PEN <dbl> 3088, 19216, 21828, 18388, 12858, 25256, 1579~
$ nama_kelurahan <chr> "GAMBIR", "CIDENG", "PETOJO UTARA", "PETOJO S~
months_col <- c('MARET_2020','APRIL_2020','MEI_2020','JUNI_2020','JULI_2020','AGUSTUS_2020','SEPTEMBER_2020','OKTOBER_2020','NOVEMBER_2020','DESEMBER_2020','JANUARI_2021','FEBRUARI_2021','MARET_2021','APRIL_2021','MEI_2021','JUNI_2021','JULI_2021')
The following code chunk will perform 2 tasks:
cc_boxplot_list <- vector(mode = "list", length = length(months_col))
for (i in 1:length(months_col)) {
box_plot <- ggplot(cumu_cc_rate, aes_string(x = months_col[[i]])) +
geom_boxplot(color="midnight blue", fill="light sky blue") +
labs(title = months_col[[i]]) +
theme(plot.title = element_text(size = 10),
axis.title = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank())
cc_boxplot_list[[i]] <- box_plot
}
ggarrange(plotlist = cc_boxplot_list,
ncol = 5,
nrow = 4)
Here are some of the observations that we can draw from the boxplots of the cumulative confirmed cases rates above:
The following code chunk will perform 2 tasks:
cc_histogram_list <- vector(mode = "list", length = length(months_col))
for (i in 1:length(months_col)) {
hist_plot <- ggplot(cumu_cc_rate, aes_string(x = months_col[[i]])) +
geom_histogram(fill = "steel blue") +
labs(title = months_col[[i]]) +
theme(plot.title = element_text(size = 10),
axis.title = element_blank())
cc_histogram_list[[i]] <- hist_plot
}
ggarrange(plotlist = cc_histogram_list,
ncol = 5,
nrow = 4)
Here are some of the observations that we can draw from the cumulative confirmed cases histogram above:
The following code chunk will perform 2 tasks:
# Create a list to store the boxplots
death_boxplot_list <- vector(mode = "list", length = length(months_col))
# For each month_year, create a boxplot using the ggplot2 package and store it in the boxplot_list created
for (i in 1:length(months_col)) {
box_plot <- ggplot(cumu_death_rate, aes_string(x = months_col[[i]])) +
geom_boxplot(color="firebrick", fill="light coral") +
labs(title = months_col[[i]]) +
theme(plot.title = element_text(size = 10),
axis.title = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank())
death_boxplot_list[[i]] <- box_plot
}
ggarrange(plotlist = death_boxplot_list,
ncol = 5,
nrow = 4)
Here are some of the observations that we can draw from the boxplots of the cumulative death cases rates above:
The following code chunk will perform 2 tasks:
death_histogram_list <- vector(mode = "list", length = length(months_col))
for (i in 1:length(months_col)) {
hist_plot <- ggplot(cumu_death_rate, aes_string(x = months_col[[i]])) +
geom_histogram(fill = "firebrick") +
labs(title = months_col[[i]]) +
theme(plot.title = element_text(size = 10),
axis.title = element_blank())
death_histogram_list[[i]] <- hist_plot
}
ggarrange(plotlist = death_histogram_list,
ncol = 5,
nrow = 4)
Here are some of the observations that we can draw from the histograms of the cumulative death cases rates above:
As seen from the previous sections, both the cumulative confirmed cases and death cases rates by month vary significantly. Hence, when we perform thematic mapping, we should carefully choose how we decide to define the intervals that are able to show significant spatio-temporal patterns.
Additionally, although it is known that the distribution for both the cumulative confirmed cases and death cases rates are right skewed, I will still be showing the difference between the maps with Manually Defined classification and Jenks classification. Equal and Quantiles classification will not be plotted for this analysis as:
summary(cumu_cc_rate)
KODE_DESA nama_kelurahan JUMLAH_PEN
Length:261 Length:261 Min. : 3088
Class :character Class :character 1st Qu.: 26177
Mode :character Mode :character Median : 38845
Mean : 42082
3rd Qu.: 52424
Max. :167523
AGUSTUS_2020 APRIL_2020 APRIL_2021
Min. : 2.429 Min. : 0.000 Min. : 90.04
1st Qu.: 14.134 1st Qu.: 1.393 1st Qu.: 264.21
Median : 19.532 Median : 2.220 Median : 315.17
Mean : 25.203 Mean : 3.370 Mean : 342.78
3rd Qu.: 31.381 3rd Qu.: 4.016 3rd Qu.: 372.94
Max. :210.492 Max. :49.883 Max. :2014.25
DESEMBER_2020 FEBRUARI_2021 JANUARI_2021
Min. : 42.06 Min. : 75.2 Min. : 59.22
1st Qu.: 106.54 1st Qu.: 217.0 1st Qu.: 163.49
Median : 124.56 Median : 256.6 Median : 192.08
Mean : 142.66 Mean : 280.1 Mean : 212.31
3rd Qu.: 152.73 3rd Qu.: 309.5 3rd Qu.: 229.76
Max. :1036.27 Max. :1632.1 Max. :1318.01
JULI_2020 JULI_2021 JUNI_2020
Min. : 1.518 Min. : 187.3 Min. : 0.000
1st Qu.: 7.477 1st Qu.: 545.1 1st Qu.: 4.179
Median : 10.690 Median : 658.3 Median : 6.515
Mean : 14.002 Mean : 705.3 Mean : 8.728
3rd Qu.: 15.758 3rd Qu.: 779.7 3rd Qu.: 10.631
Max. :112.243 Max. :3808.3 Max. :105.336
JUNI_2021 MARET_2020 MARET_2021
Min. : 119.4 Min. : 0.0000 Min. : 83.71
1st Qu.: 360.0 1st Qu.: 0.0000 1st Qu.: 247.10
Median : 423.9 Median : 0.3264 Median : 294.83
Mean : 465.3 Mean : 0.7609 Mean : 318.40
3rd Qu.: 503.4 3rd Qu.: 0.6928 3rd Qu.: 348.82
Max. :2726.7 Max. :49.8826 Max. :1839.38
MEI_2020 MEI_2021 NOVEMBER_2020 OKTOBER_2020
Min. : 0.000 Min. : 91.88 Min. : 28.85 Min. : 13.82
1st Qu.: 2.646 1st Qu.: 276.21 1st Qu.: 75.33 1st Qu.: 54.17
Median : 4.198 Median : 333.16 Median : 88.97 Median : 64.34
Mean : 5.458 Mean : 362.22 Mean :103.09 Mean : 76.97
3rd Qu.: 6.796 3rd Qu.: 391.92 3rd Qu.:110.95 3rd Qu.: 85.28
Max. :49.883 Max. :2075.78 Max. :783.68 Max. :605.57
SEPTEMBER_2020
Min. : 6.681
1st Qu.: 32.867
Median : 41.312
Mean : 51.530
3rd Qu.: 58.183
Max. :511.658
summary(cumu_death_rate)
KODE_DESA nama_kelurahan JUMLAH_PEN
Length:261 Length:261 Min. : 3088
Class :character Class :character 1st Qu.: 26177
Mode :character Mode :character Median : 38845
Mean : 42082
3rd Qu.: 52424
Max. :167523
AGUSTUS_2020 APRIL_2020 APRIL_2021 DESEMBER_2020
Min. :0.0000 Min. :0.0000 Min. : 1.679 Min. :0.000
1st Qu.:0.3321 1st Qu.:0.0000 1st Qu.: 4.273 1st Qu.:1.632
Median :0.6899 Median :0.2074 Median : 5.706 Median :2.444
Mean :0.8827 Mean :0.3482 Mean : 5.772 Mean :2.583
3rd Qu.:1.2247 3rd Qu.:0.4854 3rd Qu.: 6.882 3rd Qu.:3.211
Max. :6.1069 Max. :6.1069 Max. :25.907 Max. :9.160
FEBRUARI_2021 JANUARI_2021 JULI_2020 JULI_2021
Min. : 1.388 Min. : 0.000 Min. :0.0000 Min. : 3.439
1st Qu.: 3.243 1st Qu.: 2.415 1st Qu.:0.2155 1st Qu.: 8.446
Median : 4.486 Median : 3.266 Median :0.5490 Median :10.298
Mean : 4.637 Mean : 3.506 Mean :0.6912 Mean :10.602
3rd Qu.: 5.631 3rd Qu.: 4.337 3rd Qu.:0.9707 3rd Qu.:12.508
Max. :19.430 Max. :16.192 Max. :6.1069 Max. :42.098
JUNI_2020 JUNI_2021 MARET_2020 MARET_2021
Min. :0.0000 Min. : 2.339 Min. :0.00000 Min. : 1.388
1st Qu.:0.0000 1st Qu.: 5.503 1st Qu.:0.00000 1st Qu.: 4.002
Median :0.4144 Median : 7.253 Median :0.00000 Median : 5.306
Mean :0.5579 Mean : 7.351 Mean :0.08094 Mean : 5.438
3rd Qu.:0.7939 3rd Qu.: 8.617 3rd Qu.:0.00000 3rd Qu.: 6.441
Max. :6.1069 Max. :29.145 Max. :3.05344 Max. :25.907
MEI_2020 MEI_2021 NOVEMBER_2020 OKTOBER_2020
Min. :0.0000 Min. : 2.064 Min. :0.000 Min. :0.0000
1st Qu.:0.0000 1st Qu.: 4.697 1st Qu.:1.166 1st Qu.:0.9442
Median :0.3372 Median : 6.157 Median :1.839 Median :1.4868
Mean :0.4611 Mean : 6.350 Mean :2.022 Mean :1.6846
3rd Qu.:0.6447 3rd Qu.: 7.544 3rd Qu.:2.639 3rd Qu.:2.2127
Max. :6.1069 Max. :29.145 Max. :6.107 Max. :6.1069
SEPTEMBER_2020
Min. :0.0000
1st Qu.:0.5694
Median :0.9934
Mean :1.2192
3rd Qu.:1.6270
Max. :6.1069
cumu_cc_dkij <- left_join(dkij_slice, cumu_cc_rate, by = c('KODE_DESA' ='KODE_DESA', 'JUMLAH_PEN' = 'JUMLAH_PEN'))
glimpse(cumu_cc_dkij)
Rows: 261
Columns: 28
$ OBJECT_ID <dbl> 25477, 25478, 25397, 25400, 25390, 25391, 253~
$ KODE_DESA <chr> "3173031006", "3173031007", "3171031003", "31~
$ DESA <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMP~
$ KODE <dbl> 317303, 317303, 317103, 317103, 317102, 31710~
$ PROVINSI <chr> "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", ~
$ KAB_KOTA <chr> "JAKARTA BARAT", "JAKARTA BARAT", "JAKARTA PU~
$ KECAMATAN <chr> "TAMAN SARI", "TAMAN SARI", "KEMAYORAN", "KEM~
$ DESA_KELUR <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMP~
$ JUMLAH_PEN <dbl> 21609, 9069, 29085, 41913, 15793, 33383, 3590~
$ nama_kelurahan <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMP~
$ AGUSTUS_2020 <dbl> 27.766209, 8.821259, 44.352759, 44.139050, 38~
$ APRIL_2020 <dbl> 1.3883104, 2.2053148, 3.7820182, 3.1016630, 1~
$ APRIL_2021 <dbl> 249.8959, 347.3371, 322.5030, 371.7224, 464.1~
$ DESEMBER_2020 <dbl> 120.32024, 94.82854, 136.49648, 160.80930, 18~
$ FEBRUARI_2021 <dbl> 201.3050, 250.3032, 278.1502, 320.6642, 385.6~
$ JANUARI_2021 <dbl> 161.5068, 168.7066, 204.5728, 245.0314, 279.2~
$ JULI_2020 <dbl> 8.792633, 6.615944, 24.755028, 22.904588, 16.~
$ JULI_2021 <dbl> 545.1432, 853.4568, 652.9139, 694.7725, 939.6~
$ JUNI_2020 <dbl> 2.776621, 2.205315, 10.658415, 12.168062, 11.~
$ JUNI_2021 <dbl> 335.5084, 527.0702, 438.0265, 480.2806, 609.7~
$ MARET_2020 <dbl> 0.0000000, 1.1026574, 0.6876397, 0.0000000, 1~
$ MARET_2021 <dbl> 231.3851, 309.8467, 306.6873, 350.4879, 443.2~
$ MEI_2020 <dbl> 1.851081, 2.205315, 5.501117, 7.157684, 14.56~
$ MEI_2021 <dbl> 263.7790, 378.2115, 339.3502, 379.3572, 480.5~
$ NOVEMBER_2020 <dbl> 102.27220, 61.74881, 109.33471, 128.36113, 13~
$ OKTOBER_2020 <dbl> 84.68694, 47.41427, 92.48754, 103.30924, 116.~
$ SEPTEMBER_2020 <dbl> 66.63890, 31.97706, 63.60667, 63.70339, 93.71~
$ geometry <MULTIPOLYGON [m]> MULTIPOLYGON (((-3626874 69..., ~
cumu_death_dkij <- left_join(dkij_slice, cumu_death_rate, by = c('KODE_DESA' ='KODE_DESA', 'JUMLAH_PEN' = 'JUMLAH_PEN' ))
glimpse(cumu_death_dkij)
Rows: 261
Columns: 28
$ OBJECT_ID <dbl> 25477, 25478, 25397, 25400, 25390, 25391, 253~
$ KODE_DESA <chr> "3173031006", "3173031007", "3171031003", "31~
$ DESA <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMP~
$ KODE <dbl> 317303, 317303, 317103, 317103, 317102, 31710~
$ PROVINSI <chr> "DKI JAKARTA", "DKI JAKARTA", "DKI JAKARTA", ~
$ KAB_KOTA <chr> "JAKARTA BARAT", "JAKARTA BARAT", "JAKARTA PU~
$ KECAMATAN <chr> "TAMAN SARI", "TAMAN SARI", "KEMAYORAN", "KEM~
$ DESA_KELUR <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMP~
$ JUMLAH_PEN <dbl> 21609, 9069, 29085, 41913, 15793, 33383, 3590~
$ nama_kelurahan <chr> "KEAGUNGAN", "GLODOK", "HARAPAN MULIA", "CEMP~
$ AGUSTUS_2020 <dbl> 0.4627701, 1.1026574, 2.0629190, 2.1473051, 0~
$ APRIL_2020 <dbl> 0.0000000, 0.0000000, 0.6876397, 1.1929473, 0~
$ APRIL_2021 <dbl> 1.851081, 9.923917, 5.157298, 6.919094, 6.965~
$ DESEMBER_2020 <dbl> 0.9255403, 3.3079722, 3.7820182, 4.5331997, 3~
$ FEBRUARI_2021 <dbl> 1.388310, 5.513287, 4.813478, 5.726147, 4.432~
$ JANUARI_2021 <dbl> 0.9255403, 4.4106296, 4.4696579, 4.7717892, 4~
$ JULI_2020 <dbl> 0.0000000, 1.1026574, 1.7190992, 1.6701262, 0~
$ JULI_2021 <dbl> 5.090472, 14.334546, 11.689875, 13.361010, 12~
$ JUNI_2020 <dbl> 0.0000000, 1.1026574, 1.0314595, 1.6701262, 0~
$ JUNI_2021 <dbl> 3.239391, 12.129231, 7.564036, 9.782168, 9.49~
$ MARET_2020 <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0~
$ MARET_2021 <dbl> 1.388310, 6.615944, 4.813478, 6.919094, 6.331~
$ MEI_2020 <dbl> 0.0000000, 1.1026574, 0.6876397, 1.4315368, 0~
$ MEI_2021 <dbl> 2.776621, 11.026574, 5.844937, 7.634863, 8.23~
$ NOVEMBER_2020 <dbl> 0.9255403, 3.3079722, 3.0943785, 4.2946103, 2~
$ OKTOBER_2020 <dbl> 0.4627701, 2.2053148, 3.0943785, 4.0560208, 1~
$ SEPTEMBER_2020 <dbl> 0.4627701, 1.1026574, 2.7505587, 2.8630735, 0~
$ geometry <MULTIPOLYGON [m]> MULTIPOLYGON (((-3626874 69..., ~
The following code chunk will perform 2 tasks:
cc_map_list_manual <- vector(mode = "list", length = length(months_col))
for (i in 1:length(months_col)) {
cmap <- tm_shape(cumu_cc_dkij) +
tm_fill(col = months_col[[i]],
palette = 'Blues',
breaks = c(0, 450, 900, 1350, 1800, 2250, 2700, 3150, 3600, 4050)
) +
tm_borders(lwd = 0.1,
alpha = 0.3) +
tm_layout(panel.show = TRUE,
panel.labels = months_col[[i]],
panel.label.color = 'black',
panel.label.size = 0.5,
legend.show = TRUE,
legend.text.size = 0.5,
inner.margins = 0
)
cc_map_list_manual[[i]] <- cmap
}
tmap_arrange(cc_map_list_manual, outer.margins=0, ncol = 3)
The following code chunk will perform 2 tasks:
death_map_list_manual <- vector(mode = "list", length = length(months_col))
for (i in 1:length(months_col)) {
cmap <- tm_shape(cumu_death_dkij) +
tm_fill(col = months_col[[i]],
palette = 'Reds',
breaks = c(0, 5, 10, 15, 20, 25, 30, 35, 40, 45)) +
tm_borders(lwd = 0.1,
alpha = 0.3) +
tm_layout(panel.show = TRUE,
panel.labels = months_col[[i]],
panel.label.color = 'black',
panel.label.size = 0.5,
legend.show = TRUE,
legend.text.size = 0.5,
inner.margins = 0
)
death_map_list_manual[[i]] <- cmap
}
tmap_arrange(death_map_list_manual, ncol = 3, outer.margins=0)
From both the cumulative confirmed and death cases maps above, some observations that can be found are:
Some disadvantages that are commonly known when using this manual classification are:
Therefore, since our distribution for both cumulative confirmed cases and death cases rates are right skewed, it would be better to use Natural Breaks classification to plot the choropleth maps for both cumulative confirmed cases and death rates instead.
The following code chunk will perform 2 tasks:
cc_map_list_jenks <- vector(mode = "list", length = length(months_col))
for (i in 1:length(months_col)) {
cmap <- tm_shape(cumu_cc_dkij) +
tm_fill(col = months_col[[i]],
palette = 'Blues',
style = 'jenks',
n = 5) +
tm_borders(lwd = 0.1,
alpha = 0.3) +
tm_layout(panel.show = TRUE,
panel.labels = months_col[[i]],
panel.label.color = 'black',
panel.label.size = 0.5,
legend.show = TRUE,
legend.text.size = 0.5,
inner.margins = 0)
cc_map_list_jenks[[i]] <- cmap
}
tmap_arrange(cc_map_list_jenks, ncol = 3, outer.margins=0)
The following code chunk will perform 2 tasks:
death_map_list_jenks <- vector(mode = "list", length = length(months_col))
for (i in 1:length(months_col)) {
cmap <- tm_shape(cumu_death_dkij) +
tm_fill(col = months_col[[i]],
palette = 'Reds',
style = 'jenks',
n = 5) +
tm_borders(lwd = 0.1,
alpha = 0.3) +
tm_layout(panel.show = TRUE,
panel.labels = months_col[[i]],
panel.label.color = 'black',
panel.label.size = 0.5,
legend.show = TRUE,
legend.text.size = 0.5,
inner.margins = 0)
death_map_list_jenks[[i]] <- cmap
}
tmap_arrange(death_map_list_jenks, ncol = 3, outer.margins=0)
Overall, although choropleth maps provides a quick and easy way to spot spatial patterns, the observations are not conclusive as our data contains outliers. The next section will attempt to determine if these are more significant patterns that can be derived using extreme value maps.
As it is evident from the boxplots created earlier, the COVID data contains outliers. Hence, we can also make use of extreme value maps like percentile and box map to highlight these values at the lower and upper end of the scale.
# Extract variable as vector out of sf dataframe
get.var <- function(vname, df) {
v <- df[vname]
v <- unname(v[[1]])
return(v)
}
# To create break points for box map
boxbreaks <- function(v, mult=1.5) {
qv <- unname(quantile(v))
iqr <- qv[4] - qv[2]
upfence <- qv[4] + mult * iqr
lofence <- qv[2] - mult * iqr
# initialize break points vector
bb <- vector(mode="numeric",length=7)
# logic for lower and upper fences
if (lofence < qv[1]) { # no lower outliers
bb[1] <- lofence
bb[2] <- floor(qv[1])
} else {
bb[2] <- lofence
bb[1] <- qv[1]
}
if (upfence > qv[5]) { # no upper outliers
bb[7] <- upfence
bb[6] <- ceiling(qv[5])
} else {
bb[6] <- upfence
bb[7] <- qv[5]
}
bb[3:5] <- qv[2:4]
return(bb)
}
# Boxmap function
boxmap <- function(vnam, df,
legtitle=NA,
mult=1.5){
var <- get.var(vnam,df)
bb <- boxbreaks(var)
tm_shape(df) +
tm_fill(vnam,title=legtitle,
breaks=bb,
palette="Blues",
labels = c("lower outlier",
"< 25%",
"25% - 50%",
"50% - 75%",
"> 75%",
"upper outlier")) +
tm_borders(lwd=0.1, alpha=0.3) +
tm_layout(legend.show = TRUE)
}
var <- get.var("JULI_2021", cumu_cc_dkij)
boxbreaks(var)
[1] 187.3189 193.3497 545.1432 658.3395 779.6722 1131.4658
[7] 3808.2902
tmap_arrange(cc_map_list_boxmap, ncol = 3, outer.margins=0)
# Extract variable as vector out of sf dataframe
get.var <- function(vname, df) {
v <- df[vname]
v <- unname(v[[1]])
return(v)
}
boxmap <- function(vnam, df,
legtitle=NA,
mult=1.5){
var <- get.var(vnam,df)
bb <- boxbreaks(var)
tm_shape(df) +
tm_fill(vnam,title=legtitle,
breaks=bb,
palette="Reds",
labels = c("lower outlier",
"< 25%",
"25% - 50%",
"50% - 75%",
"> 75%",
"upper outlier")) +
tm_borders(lwd=0.1, alpha=0.3) +
tm_layout(title = NA,
legend.show = TRUE)
}
tmap_arrange(death_map_list_boxmap, ncol = 3, outer.margins=0)
# Creating the percentmap function
percentmap <- function(vnam, df, legtitle=NA){
percent <- c(0,.01, .1,.5,.9,.99,1)
var <- get.var(vnam,df)
bperc <- quantile (var, percent)
tm_shape (cumu_cc_dkij) +
tm_polygons() +
tm_shape(df)+
tm_fill(vnam, title=legtitle, breaks=bperc, palette="Blues",
labels = c("< 1%", "1%-10%", "10%-50%", "50%-90%", "90%-99%", ">99%"))+
tm_borders(lwd=0.1, alpha = 0.3) +
tm_layout(legend.show = TRUE)
}
tmap_arrange(cc_map_list_percentile, ncol = 3, outer.margins=0)
# Creating the percentmap function
percentmap <- function(vnam, df, legtitle=NA){
percent <- c(0,.01, .1,.5,.9,.99,1)
var <- get.var(vnam,df)
bperc <- quantile (var, percent)
tm_shape (cumu_death_dkij) +
tm_polygons() +
tm_shape(df)+
tm_fill(vnam, title=legtitle, breaks=bperc, palette="Reds",
labels = c("< 1%", "1%-10%", "10%-50%", "50%-90%", "90%-99%", ">99%"))+
tm_borders(lwd=0.1, alpha = 0.3) +
tm_layout(legend.show = TRUE)
}
tmap_arrange(death_map_list_percentile, ncol = 3, outer.margins=0)
rel_risk_geo <- left_join(dkij_slice, rel_risk, by = c('KODE_DESA' ='KODE_DESA'))
total_deaths_geo <- left_join(dkij_slice, total_deaths, by = c('KODE_DESA' ='KODE_DESA', 'JUMLAH_PEN'='JUMLAH_PEN'))
relative_risk_map <- tm_shape(rel_risk_geo) +
tm_fill(col = "REL_RISK", n = 5, style = "jenks", title = "Relative Risk", palette="Greens") +
tm_layout(main.title = "Relative Risk",
main.title.position = "center",
main.title.size = 1.5) +
tm_borders(alpha = 0.3)
total_death_map <- tm_shape(total_deaths_geo)+
tm_fill(col="TOTAL_DEATHS", style="jenks", n = 5, title= "Total Deaths", palette="Reds") +
tm_layout(main.title = "Total Deaths",
main.title.position = "center",
main.title.size = 1.5) +
tm_borders(alpha = 0.3)
total_pop_map <- tm_shape(total_deaths_geo)+
tm_fill(col="JUMLAH_PEN", style="jenks", n = 5, title= "Total Population", palette="Purples") +
tm_layout(main.title = "Total Population",
main.title.position = "center",
main.title.size = 1.5) +
tm_borders(alpha = 0.3)
tmap_arrange(relative_risk_map, total_death_map, total_pop_map, ncol=3)
Some of the conclusions that we can derive from this analysis includes:
Anselin, L. (2020, September 15). Basic mapping. GeoDa on Github. Retrieved from: https://geodacenter.github.io/workbook/3a_mapping/lab3a.html
Anselin, L. (2018, August 3). Maps for rates or proportions. GeoDa on Github. Retrieved from: https://geodacenter.github.io/workbook/3b_rates/lab3b.html
Gimond, M. (2021, September 07). E mapping rates in R | Intro to GIS and spatial analysis. Main repos | mgimond.github.io. Retrieved from: https://mgimond.github.io/Spatial/mapping-rates-in-r.html#raw-rates
Lovelace, R., Nowosad, J., & Muenchow, J. (n.d.). Chapter 8 making maps with R. Welcome | Geocomputation with R. Retrieved from: https://geocompr.robinlovelace.net/adv-map.html
Situation report #5: COVID-19 pandemic - 24th March 2020 - Indonesia. (24 March 2020). ReliefWeb. Retrieved from: https://reliefweb.int/report/indonesia/situation-report-5-covid-19-pandemic-24th-march-2020
Tin Seong.K (2021, August 30) In-class Exercise 3: Analytical Mapping. Retrieved from: https://is415.netlify.app/in-class_ex/in-class_ex03/in-class_ex03#1