From: ibidyouadu <60790401+ibidyouadu@users.noreply.github.com> Date: Mon, 11 May 2020 19:44:52 +0000 (-0400) Subject: Add files via upload X-Git-Url: http://git.angelumana.com/?a=commitdiff_plain;h=1c6524d1bd0a9175b09fd50fb7911896e19d7e65;p=tweet_classification%2F.git Add files via upload --- diff --git a/power_outages/outages_daily.csv b/power_outages/outages_daily.csv new file mode 100644 index 0000000..2a7b9ed --- /dev/null +++ b/power_outages/outages_daily.csv @@ -0,0 +1,31 @@ +index,date,Alachua,Baker,Bay,Bradford,Brevard,Broward,Calhoun,Charlotte,Citrus,Clay,Collier,Columbia,Desoto,Dixie,Duval,Escambia,Flagler,Franklin,Gadsden,Gilchrist,Glades,Gulf,Hamilton,Hardee,Hendry,Hernando,Highlands,Hillsborough,Holmes,Indian River,Jackson,Jefferson,Lafayette,Lake,Lee,Leon,Levy,Liberty,Madison,Manatee,Marion,Martin,Miami-Dade,Monroe,Nassau,Okaloosa,Okeechobee,Orange,Osceola,Palm Beach,Pasco,Pinellas,Polk,Putnam,Santa Rosa,Sarasota,Seminole,St. Johns,St. Lucie,Sumter,Suwannee,Taylor,Union,Volusia,Wakulla,Walton,Washington +0,2017-09-01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +1,2017-09-02,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2,2017-09-03,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +3,2017-09-04,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +4,2017-09-05,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +5,2017-09-06,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +0,2017-09-07,,,,,,,,,,,,,,,,,,,,,,,,,,,,1.0,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,, +1,2017-09-08,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,, +2,2017-09-09,515.0,,,,315.0,7823.0,,60.0,,,,67.0,340.0,12.0,1250.0,,27.0,1583.0,,710.0,,47.0,18.0,,,62.0,,,,108.0,,1.0,695.0,,,,,,76.0,51.0,202.0,95.0,27403.0,3343.0,,,,58.0,,1629.0,215.0,144.0,,15.0,,95.0,174.0,,,32.0,14.0,8.0,118.0,1266.0,,, 
+3,2017-09-10,2459.0,71.0,1657.0,,25968.0,428378.0,,14115.0,1148.0,,,886.0,4753.0,275.0,4712.0,210.0,3679.0,14.0,,302.0,2229.0,,17.0,2886.0,,196.0,,30453.0,,15442.0,99.0,4.0,60.0,4391.0,,,288.0,7.0,715.0,29768.0,3791.0,21774.0,633480.0,52321.0,1066.0,196.0,5351.0,12728.0,6123.0,285376.0,4443.0,9821.0,7110.0,844.0,88.0,34880.0,8429.0,,28048.0,395.0,370.0,15.0,87.0,3853.0,1952.0,24.0,278.0 +4,2017-09-11,64713.0,7685.0,2202.0,10851.0,235657.0,630644.0,911.0,66761.0,64487.0,55039.0,236581.0,27476.0,14447.0,6649.0,247776.0,1398.0,46517.0,5308.0,,6194.0,,3631.0,5038.0,11595.0,18750.0,53097.0,61781.0,253140.0,188.0,61530.0,3189.0,5398.0,3177.0,119579.0,328470.0,41894.0,17367.0,1353.0,6307.0,103271.0,136526.0,61451.0,729949.0,,33352.0,52.0,19211.0,343147.0,65607.0,472220.0,169849.0,412702.0,202643.0,31907.0,12.0,140088.0,147308.0,79750.0,95919.0,26203.0,18344.0,8458.0,3663.0,202975.0,6983.0,84.0,377.0 +5,2017-09-12,34085.0,6496.0,1033.0,,184115.0,458337.0,6.0,43685.0,54434.0,37217.0,,20991.0,9302.0,5344.0,137960.0,0.0,39878.0,4923.0,,4897.0,,3327.0,4282.0,,13042.0,36645.0,56747.0,216931.0,,32521.0,152.0,5320.0,3377.0,91550.0,,12304.0,12430.0,152.0,5966.0,80151.0,115994.0,52366.0,646326.0,,27574.0,,,271377.0,35384.0,390341.0,127987.0,378983.0,147476.0,30042.0,0.0,118386.0,129357.0,52403.0,65539.0,21907.0,17062.0,6163.0,3876.0,162468.0,5276.0,, +6,2017-09-13,22373.0,5743.0,658.0,5646.0,124089.0,306143.0,,38613.0,45463.0,29913.0,,11774.0,6707.0,3877.0,97233.0,0.0,26084.0,3056.0,,3257.0,,1906.0,2805.0,6839.0,11945.0,22427.0,59922.0,161292.0,0.0,,9.0,3640.0,2497.0,54072.0,,4098.0,8982.0,478.0,3805.0,55750.0,73763.0,29897.0,435234.0,53231.0,18186.0,0.0,12399.0,165549.0,20055.0,281431.0,83011.0,236660.0,107586.0,25646.0,,81969.0,83143.0,43273.0,33701.0,15244.0,10694.0,2821.0,2675.0,119164.0,3607.0,0.0,0.0 
+7,2017-09-14,27310.0,4176.0,,3636.0,93459.0,203012.0,,32240.0,21463.0,14889.0,,6086.0,5570.0,2326.0,65861.0,,21071.0,0.0,20.0,1698.0,3142.0,,2387.0,3860.0,9498.0,9682.0,50855.0,71886.0,,11288.0,,1860.0,1335.0,30147.0,,105.0,3793.0,,1581.0,41120.0,48815.0,14765.0,300288.0,44438.0,9292.0,,11404.0,98708.0,9199.0,155183.0,47979.0,146350.0,62131.0,19727.0,,66809.0,50283.0,28736.0,21358.0,10358.0,7946.0,1403.0,1283.0,74910.0,1233.0,, +8,2017-09-15,10576.0,2977.0,,3055.0,71043.0,146977.0,,23898.0,12000.0,7294.0,119634.0,3278.0,5429.0,1611.0,36193.0,,15213.0,,,904.0,2277.0,,1096.0,2180.0,7374.0,4530.0,46793.0,31454.0,,,8.0,917.0,884.0,18544.0,147380.0,,1872.0,,561.0,33742.0,31333.0,5211.0,246191.0,34441.0,7458.0,,8915.0,58928.0,,93553.0,26986.0,88100.0,36833.0,16079.0,,56400.0,33554.0,,11056.0,7935.0,5047.0,768.0,486.0,49907.0,788.0,, +9,2017-09-16,6724.0,1602.0,,2111.0,41730.0,87387.0,,15094.0,6759.0,3883.0,99491.0,1568.0,3948.0,762.0,8258.0,,10217.0,,,400.0,1437.0,,458.0,1282.0,6110.0,2924.0,,11091.0,,3275.0,,316.0,589.0,10124.0,107750.0,,786.0,,258.0,24042.0,16421.0,1567.0,180713.0,26415.0,,,4327.0,27093.0,2972.0,47110.0,6046.0,16850.0,21461.0,9385.0,,42536.0,19656.0,,2499.0,4150.0,2468.0,400.0,102.0,29762.0,,, +10,2017-09-17,3643.0,877.0,,,15522.0,33510.0,,9118.0,4222.0,752.0,87487.0,374.0,3081.0,263.0,1105.0,,3616.0,,,192.0,976.0,,156.0,1509.0,,2039.0,36784.0,,,656.0,,,295.0,12005.0,92759.0,,325.0,,,17013.0,19194.0,399.0,107507.0,21957.0,,,893.0,39536.0,,12045.0,1430.0,16189.0,,4868.0,,24777.0,31755.0,,543.0,3965.0,1157.0,,36.0,23323.0,,, +11,2017-09-18,1238.0,,,,1542.0,10582.0,,5223.0,,,67551.0,,2018.0,,274.0,,545.0,,,,,,,,,,22972.0,,,,,,,,62808.0,,,,,,,74.0,44438.0,18998.0,,,,19174.0,,1654.0,,1161.0,,,,18116.0,15024.0,,,,,,,,,, +12,2017-09-19,,,,,116.0,2700.0,,1295.0,,,47058.0,,480.0,,25.0,,20.0,,,,,,,,,,13784.0,,,,,,,,40264.0,,,,,,,10.0,13431.0,11058.0,,,,5699.0,,48.0,,,,,,3661.0,4287.0,,,,,,,,,, 
+13,2017-09-20,,,,,,164.0,,68.0,,,29497.0,,116.0,,,,0.0,,,,,,,,,,1029.0,,,,,,,,25585.0,,,,,,,,1045.0,,,,,0.0,,0.0,,,,,,362.0,5.0,,,,,,,,,, +14,2017-09-21,,,,,,0.0,,53.0,,,11350.0,,5.0,,,,,,,,,,,,,,,,,,,,,,10926.0,,,,,,,,5.0,,,,,0.0,,,,,,,,20.0,,,,,,,,,,, +15,2017-09-22,,,,,0.0,,,,,,4781.0,,,,,,,,,,,,,,,,,,,,,,,,3308.0,,,,,,,,,,,,,0.0,,,,,,,,0.0,,,,,,,,,,, +16,2017-09-23,,,,,,,,,,,1462.0,,,,,,,,,,,,,,,,,,,,,,,,1973.0,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,, +17,2017-09-24,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,, +18,2017-09-25,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,, +19,2017-09-26,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,, +20,2017-09-27,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,, +21,2017-09-28,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,, +0,2017-09-29,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +1,2017-09-30,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, diff --git a/power_outages/outages_daily_normalized.csv b/power_outages/outages_daily_normalized.csv new file mode 100644 index 0000000..5eade48 --- /dev/null +++ b/power_outages/outages_daily_normalized.csv @@ -0,0 +1,31 @@ +date,Alachua,Baker,Bay,Bradford,Brevard,Broward,Calhoun,Charlotte,Citrus,Clay,Collier,Columbia,Desoto,Dixie,Duval,Escambia,Flagler,Franklin,Gadsden,Gilchrist,Glades,Gulf,Hamilton,Hardee,Hendry,Hernando,Highlands,Hillsborough,Holmes,Indian River,Jackson,Jefferson,Lafayette,Lake,Lee,Leon,Levy,Liberty,Madison,Manatee,Marion,Martin,Miami-Dade,Monroe,Nassau,Okaloosa,Okeechobee,Orange,Osceola,Palm Beach,Pasco,Pinellas,Polk,Putnam,Santa Rosa,Sarasota,Seminole,St. Johns,St. 
Lucie,Sumter,Suwannee,Taylor,Union,Volusia,Wakulla,Walton,Washington +2017-09-01,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-02,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-03,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-04,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-05,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-06,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-07,,,,,,,,,,,,,,,,,,,,,,,,,,,,1.5828294659533383e-06,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-08,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-09,0.004128057969155792,,,,0.001024003952980339,0.008382085074466946,,0.0005258545135845749,,,,0.002127389344002032,0.020481927710843374,0.001273074474856779,0.0027263051367950866,,0.0004653888582460011,0.15519607843137254,,0.07999098693105003,,0.004308369236410303,0.002680565897244974,,,0.0006544019083204036,,,,0.0012157370405808522,,0.0001232589670898558,0.1742728184553661,,,,,,0.007097497198356369,0.00024313848882278065,0.001068076668869795,0.0010171306209850108,0.02409932054391664,0.05249438625692885,,,,0.00010235739673374018,,0.0022043301759133963,0.0008729794585903208,0.0002659559732049357,,0.00036349537149226965,,0.0003601213040181956,0.0008263246126009754,,,0.00045029832263874815,0.0006152223589383019,0.0006193868070610096,0.021699154100772344,0.0044477546919244795,,, 
+2017-09-10,0.019850014933927462,0.006248349907594825,0.01433031505937092,,0.083594081971646,0.4589928211721847,,0.12370727432077125,0.013088736617678915,,,0.026556364835296584,0.2705949331056077,0.02918081494057725,0.010277303996196176,0.0013752545858191606,0.06343103448275862,0.001372279945108802,,0.03400900900900901,0.30297675683022973,,0.002532777115613826,0.23421522480116863,,0.002064744487869626,,0.04811447747611103,,0.16983975099261997,0.0037842590115056765,0.0004930358683594232,0.015045135406218655,0.024686985219293068,,,0.012388162422573986,0.0017249876786594382,0.06677250653716847,0.14191659873091245,0.020048548053582665,0.23312633832976445,0.5571082574229214,0.8210048958071805,0.023796767568532905,0.0018079346191807104,0.24078657246996354,0.021939078034721934,0.04009009303939606,0.37214203093967896,0.016451107663483577,0.01779434226523191,0.022459984268534225,0.020452672902631707,0.0012772504281691776,0.1322038395209127,0.04004218467195242,,0.18233114477020088,0.005477894269706551,0.0162630214056525,0.001161350263239393,0.015998528870908423,0.013436979068583346,0.12617154676491502,0.00040385683275278916,0.021237585943468296 
+2017-09-11,0.49766980435585084,0.6763178738009329,0.019043665516436187,0.8578543758399874,0.7587691336797842,0.6756750386241448,0.23145325203252032,0.5807425320551853,0.7352129697190807,0.6058828062218602,0.9620594442704881,0.8235470431316129,0.822487902077996,0.6637715882998901,0.5417479294391339,0.00895373264333658,0.8017545976318102,0.5202901391883944,,0.6959550561797753,,0.3328444403703364,0.7505959475566151,0.943526731223045,1.0,0.5592925760512345,0.9891923914435763,0.39993490827912975,0.018037033483641947,0.6721284614124201,0.12189900997668285,0.6653519043510415,0.7854140914709518,0.6722944672142669,0.7482505057131923,0.29132505823858695,0.7028328611898017,0.3334154756037457,0.5889988793425476,0.49233637018073295,0.7220121528787726,0.6579336188436831,0.6419470471010987,,0.7445307616751495,0.00047965612345610685,0.8429574374725757,0.5914777506584481,0.42955915956812957,0.615794284909506,0.6288654068303664,0.7477609858004012,0.6127105934714512,0.7732031212135899,0.00015736466638690725,0.5309682187730969,0.7231721625748048,0.6657706242799659,0.6235389715920172,0.35703287869084765,0.8060816452080678,0.6548467017652524,0.6735932328061788,0.7083529637578739,0.4513606101738737,0.001412334386979622,0.02880061115355233 
+2017-09-12,0.2621277839301095,0.5716800140807885,0.008933744994767748,,0.5928140434931,0.4910928961748634,0.001524390243902439,0.38286590709903595,0.6205850833390337,0.40920285871357887,,0.6291700386655876,0.5295758610873897,0.5334930617949486,0.3066922762275694,0.0,0.6873621070049641,0.4825524406979024,,0.5502247191011236,,0.30497754147951234,0.6379618593563766,,0.7303169447866502,0.3867301279074676,0.9140652685158339,0.34272844903176064,,0.3552460538532962,0.006138190041594314,0.6557377049180327,0.8348578491965389,0.531769680705851,,0.08586302669960502,0.5346696490020647,0.0374384236453202,0.557153530070975,0.38211358858107236,0.6134295127742727,0.5606638115631691,0.5684055559561898,,0.6155460308956157,,,0.4677768546321418,0.23169046823946937,0.5090207042604327,0.47392236511280866,0.6866666545875557,0.4658661940909077,0.7280085300247177,0.0,0.4487717968157695,0.61451380740417,0.40072953069917183,0.4260482350646818,0.2976090205135172,0.7497473304917168,0.4771601114896253,0.7127620448694373,0.6248625031730037,0.3410692352446829,, 
+2017-09-13,0.17437356299442733,0.505412303088973,0.0056906139463283436,0.4463593959996838,0.39945725653803066,0.3280220722168649,,0.3384136722173532,0.518309505894156,0.31540821813810777,,0.35290591373677427,0.38183888414460576,0.3870420285514625,0.21614922572814455,0.0,0.44957686275185715,0.2995491080180357,,0.3659550561797753,,0.1747181226510221,0.4179082240762813,0.5564233992352128,0.6688878933811178,0.23622287760690963,0.9594274369155886,0.2548246078302812,0.0,,0.00034402354650051606,0.4486626402070751,0.6173053152039555,0.3040024287810555,,0.028904751156754317,0.36349656009712666,0.11747358073236668,0.3553418005229735,0.26578374023274554,0.3900926009170188,0.3200963597430407,0.3827626054669568,0.8346949335925862,0.4059737476560407,0.0,0.5440544098288723,0.2853545277789269,0.1313092954279092,0.36699758882801914,0.3073400099964827,0.42879635887280154,0.3252966147817568,0.6214801531527165,,0.3106828131218375,0.3949729932590034,0.3309117604325184,0.21907950334785153,0.20709142779513653,0.4699213428835084,0.2184112728398885,0.4919087899963222,0.4158648728820953,0.23326650714609068,0.0,0.0 
+2017-09-14,0.21002522456010828,0.3675085804805069,,0.287453553640604,0.3037519256895106,0.21752062573663344,,0.28255915863277825,0.24469297945595914,0.15699237655394932,,0.1824176482930192,0.31710788499857673,0.2322052510731756,0.14640918366893266,,0.3631929123000552,0.0,0.0008960573476702509,0.19078651685393258,0.4331403363661428,,0.3556317044100119,0.3140509315759499,0.5318624706014111,0.10217822618092785,0.814253234276931,0.11357241374952011,,0.12330547818012999,,0.22926167878713177,0.3300370828182942,0.16949181129720522,,0.0007406048978670579,0.15350060704168353,,0.1476466193500187,0.19603636589005372,0.25815612588647796,0.15808351177730193,0.2640855660873496,0.6968152666488953,0.20742923475310296,,0.5003949100482668,0.1701093643905684,0.0602340215163599,0.20236500892616127,0.17766117774264142,0.26516668267148863,0.18785905204213685,0.4780448795618669,,0.2532226581007069,0.23887070493057105,0.2197462701404767,0.1388415783657284,0.14071457682380112,0.3492593732143642,0.10862496128832456,0.23593232806178743,0.2614249070826572,0.07973873116471578,, 
+2017-09-15,0.08537363072029965,0.26199067147760274,,0.24152106885919836,0.2308974850657497,0.1574809814636237,,0.2094478527607362,0.13680826321909845,0.07690928837292675,0.4864930808300564,0.098252555225849,0.3090805579276971,0.16082659478885894,0.08398519532654979,,0.2622072079835916,,,0.10157303370786516,0.31389578163771714,,0.1632896305125149,0.17736555202994062,0.41292417963937733,0.04771434590267538,0.7492154476751633,0.049694053112948354,,,0.000323611504388981,0.11302847282139776,0.2185414091470952,0.1042576756790186,0.33572977602828347,,0.0805230557467309,,0.05239073589839372,0.16086233117369148,0.16570328571957418,0.05579229122055675,0.21651044863800978,0.540056136609537,0.166488079292794,,0.3911803422553752,0.10157338076917773,,0.12199695636808908,0.09991299357633425,0.15962545092830988,0.11136811678337748,0.38964280521494693,,0.2137983320697498,0.15939915345624528,,0.0718715465123838,0.1100402163361531,0.2217779144878499,0.05946113347785692,0.0893710923133505,0.19194557048683492,0.050960356981180885,, 
+2017-09-16,0.054278772027543006,0.1409838950981255,,0.166890663293541,0.13562704350595745,0.09363227258116362,,0.13130012700290541,0.0770572542581572,0.04094307194297705,0.4045813322706182,0.04699817162725175,0.22475236251850164,0.07607067984426474,0.019162982816434034,,0.17609748530653752,,,0.0449438202247191,0.19809760132340778,,0.06823599523241955,0.10464451881479063,0.3258666666666667,0.030798398988835053,,0.017522628062431177,,0.035774755584685125,,0.03894983360039443,0.14561186650185415,0.05691893381009406,0.24545313724418202,,0.0338093599449415,,0.024094135225999253,0.11461834408386848,0.08684178517221866,0.01677730192719486,0.1589264136573663,0.41420350304988,,,0.18986397542781921,0.046699830389277296,0.01945904891606812,0.061433375888541,0.022384716488642886,0.03052995287334871,0.06488939685304114,0.22742693743033005,,0.1612219758561222,0.09337634142981335,,0.016245205746603392,0.057550963805297464,0.10845014720745265,0.03096934035305048,0.01875689591761677,0.11446658923255618,,, +2017-09-17,0.029407728509271144,0.07718032209803749,,,0.050459013835431187,0.0359048537447766,,0.07931592407661929,0.048133707275919466,0.008268279274326553,0.35576692380576713,0.01121002307945928,0.17534574014000343,0.027907470288624787,0.002564292974533438,,0.06232763375620518,,,0.021573033707865168,0.13454645712710228,,0.023241954707985697,0.12317361848012408,,0.021476722140299137,0.588873769310814,,,0.007165874706428532,,,0.07397191574724173,0.0697312399440059,0.21130382884114227,,0.013979697178251893,,,0.08110813941846995,0.10150668196794137,0.004271948608137045,0.09454605896123952,0.3442993116209054,,,0.03918385256691531,0.06814883252720885,,0.015707174964497483,0.005806894367312463,0.029927920833079144,,0.11796636456162458,,0.09391096708169878,0.15085295696498388,,0.003529870636416824,0.054985438912772154,0.050854907476594435,,0.0066200809121000365,0.08970177611285893,,, 
+2017-09-18,0.010062668151411457,,,,0.005013003901170351,0.0113382620807886,,0.04577563540753725,,,0.27469694320302873,,0.12156626506024096,,0.0006358518325992416,,0.009393960286817429,,,,,,,,,,0.36775794444889137,,,,,,,,0.1430758296429938,,,,,,,0.0007922912205567452,0.0390805972459427,0.3016321605487108,,,,0.0338637009877943,,0.0022381596752368066,,0.002146291684922162,,,,0.0686732373009856,0.07137190443841655,,,,,,,,,, +2017-09-19,,,,,0.00037711313394018207,0.0028929604628736743,,0.011264983733189514,,,0.19136191548975034,,0.02891566265060241,,5.801567815686511e-05,,0.0003448275862068965,,,,,,,,,,0.2241956995543411,,,,,,,,0.09172088276565889,,,,,,,0.00010706638115631691,0.012056552962298026,0.1766735900303563,,,,0.015437290578078994,,6.495263870094723e-05,,,,,,0.013877937831690675,0.02036550547973188,,,,,,,,,, +2017-09-20,,,,,,0.00017572056144862317,,0.0005959684487291849,,,0.11994990057378482,,0.006987951807228916,,,,0.0,,,,,,,,,,0.016574852614284333,,,,,,,,0.058282306416634774,,,,,,,,0.0009380610412926391,,,,,,,0.0,,,,,,0.001372251705837756,2.3752630603839375e-05,,,,,,,,,, +2017-09-21,,,,,,0.0,,0.0004610379442926982,,,0.04615490970310397,,0.00030120481927710846,,,,,,,,,,,,,,,,,,,,,,0.02488928981466295,,,,,,,,4.488330341113106e-06,,,,,,,,,,,,,7.581501137225171e-05,,,,,,,,,,, +2017-09-22,,,,,0.0,,,,,,0.019441993241457275,,,,,,,,,,,,,,,,,,,,,,,,0.007535582162447834,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,, +2017-09-23,,,,,,,,,,,0.005945240351183965,,,,,,,,,,,,,,,,,,,,,,,,0.004494469046707853,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-24,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-25,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-26,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-27,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-28,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, 
+2017-09-29,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +2017-09-30,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, diff --git a/power_outages/outages_original.xlsx b/power_outages/outages_original.xlsx new file mode 100644 index 0000000..7739010 Binary files /dev/null and b/power_outages/outages_original.xlsx differ diff --git a/power_outages/power_outages.py b/power_outages/power_outages.py new file mode 100644 index 0000000..3284ca7 --- /dev/null +++ b/power_outages/power_outages.py @@ -0,0 +1,174 @@ +import numpy as np +import pandas as pd +from scipy.stats.stats import pearsonr +from copy import deepcopy +import random + +def clean_outage_data(): + # transform original dataset into nominal and 'per-capita' time series + # note that the original dataset consists of 'customers' which does not + # mean 1 person necessarily. the 'per-capita' timeseries is normalized by customers, + # hence why I use the term 'per-capita' lightly + + pwr = pd.read_excel('outages_original.xlsx') + pwr = pwr.rename({'County':'county', 'PowerProvider':'provider', \ + 'Customers':'customers', 'Out': 'out', 'CreatedOn': 'date'},axis=1) + pwr.county = pwr.county.apply(lambda x: ' '.join([w.capitalize() for w in x.split()])) + + def rename_counties(x): + # this is really just to fix the capitalization on Miami-Dade + if x=='Miami-dade': + return 'Miami-Dade' + else: + return x + + pwr.county = pwr.county.apply(rename_counties) + counties = pwr.county.unique() + + # the following is for date formatting so that dates.sort() works properly + pwr.date = pwr.date.apply(lambda x: np.datetime_as_string(np.datetime64(x))[:10]) + dates = pwr.date.unique() + dates.sort() + + # the following will be converted to dataframes. 
they are the time series + out_dict = {} + out_percent_dict = {} + + for county in counties: + county_customers = np.array([]) # customers for each date + county_out = np.array([]) # affected customers for each date + countyDF = pwr[pwr.county==county] + providers = countyDF.provider.unique() + + county_customers_total = 0 # total number of customers between all + # providers of the county. We calculate it like so: go through each + # day, and each provider. Average the customer count for the provider's + # counts (they can have multiple readings in a day). Add up the averages + # of each provider. That's the count for the day. Finally, take the + # maximum count from the whole month. + # The reason we use this number will be clear later + + # the following loop finds county_customers_total + # We have two separate loops across the dates because we want to know + # this number first. It does look awkward, don't hesitate to modify it + for date in dates: + count = 0 # the customers we count for the given date + today_countyDF = countyDF[countyDF.date==date] + for prov in providers: + today_prov_countyDF = today_countyDF[today_countyDF.provider==prov] + count += round(np.nanmean(today_prov_countyDF.customers)) + county_customers_total = max(county_customers_total, count) + + for date in dates: + date_customers = 0 + date_out = 0 + for prov in providers: + # again, a given provider on a given day for the given county + # can have multiple readings (rows). 
So we take averages + rows = countyDF[(countyDF.date==date) & (countyDF.provider==prov)] + if rows.shape[0] > 0: + date_customers += round(np.nanmean(rows.customers.values)) + date_out += round(np.nanmean(rows.out.values)) + + # The following is why we needed county_customers_total + # There are some cases where date_customers is very small + # This causes the 'per-capita' time series to have outliers + # So if the customers counted on a given day are not enough + # (for whatever reason -- missing data, readings that weren't + # recorded, etc.) we simply don't write anything for that day + # the threshold of 90% of county_customers_total is somewhat arbitrary + # I haven't tried fiddling with it, but if there comes any time to + # review how the timeseries were produced, this may be something + # to look at + + if date_customers < 0.9*county_customers_total: + date_customers = np.nan + date_out = np.nan + + # record numbers, compute 'per-capita' numbers + county_customers = np.append(county_customers, date_customers) + county_out = np.append(county_out, date_out) + county_out_percent = county_out / county_customers + out_percent_dict.update({county: county_out_percent}) + out_dict.update({county: county_out}) + + pwr_ts = pd.DataFrame(out_dict, index=dates).reset_index().rename({'index':'date'},axis=1) + pwr_percent_ts = pd.DataFrame(out_percent_dict, index=dates).reset_index().rename({'index':'date'},axis=1) + + return pwr_ts, pwr_percent_ts + +def impact_ratio(twt, c, pp_date): + pp_idx = twt[twt.date==pp_date].index.values[0] + before = twt.iloc[pp_idx-1:pp_idx][c] + beforesum = np.nansum(before.values) + after = twt.iloc[pp_idx:pp_idx+1][c] + aftersum = np.nansum(after.values) + ir_out = beforesum/aftersum + return ir_out + +def county_active_filter(twtDF,tol): + out = np.array([]) + for c in twtDF.county.values: + if twtDF[twtDF.county==c].total.values[0] >= tol: + out = np.append(out, c) + + return out + +def corr_iter(df1, col1, df2, col2): + out = {} # 
{tol: pearsonr corr} + tols = deepcopy(df1[col1].unique()) #cut off counties using values from df1[col1] + tols.sort() + + for tol in tols[:-1]: + active = df1[df1[col1]>=tol].county.values + df1_act = df1[df1.county.isin(active)] + df2_act = df2[df2.county.isin(active)] + corr = pearsonr(df1_act[col1], df2_act[col2]) + out.update({tol: corr}) + return out + +def DTWD(t1, t2): + DTW = {} + idxs = range(len(t1)) # assuming t1, t2 equal length + + # initialize norms + for i in idxs: + DTW[(i,-1)] = float('inf') + DTW[(-1,i)] = float('inf') + DTW[(-1,-1)] = 0 + + for i in idxs: + for j in idxs: + dist = (t1[i] - t2[j])**2 + DTW[(i, j)] = dist + min(DTW[(i, j-1)], DTW[(i-1, j)], DTW[(i-1, j-1)]) + + return np.sqrt(DTW[idxs[-1], idxs[-1]]) + + +# ntwt = pd.read_csv('daily_activity_normalized.csv') +# X = ntwt[test_counties].transpose().values +def kmeans_DTWD(data, num_clust, num_iter, do_random, centroids=None): + if do_random==1: + centroids = random.sample(data.tolist(), num_clust) + if centroids is None or len(centroids) == 0: + print("Don't forget to include the initial centroids!") + return True + + for n in range(num_iter): + labels = {} + for idxi, ti in enumerate(data): + nearest_clust = 0 + nearest_dist = DTWD(ti, centroids[0]) + + for idxj, tj in enumerate(centroids): + if DTWD(ti, tj) < nearest_dist: + nearest_dist = DTWD(ti, tj) + nearest_clust = idxj + + if nearest_clust not in labels: + labels[nearest_clust] = [] + labels[nearest_clust].append(idxi) + + for label in labels: + centroids[label] = np.sum(data[labels[label]], axis=0) / len(labels[label]) + return labels \ No newline at end of file diff --git a/power_outages/readme.txt b/power_outages/readme.txt new file mode 100644 index 0000000..c8d1ab2 --- /dev/null +++ b/power_outages/readme.txt @@ -0,0 +1,7 @@ +power_outages.py is the code used to produce outages_daily.csv and outages_daily_normalized.csv from outages_original.xlsx. + +There are also some functions to produce metrics and for finding patterns in the data using dynamic time 
warping and k-means, mostly copied from a Medium article. This was fruitless and can mostly be ignored. + +It is important to understand the data here: all the numbers in outages_daily.csv and outages_original.xlsx represent customers, which are not necessarily individual people. I tried to find more information about the customers, particularly the square footage of the properties, to estimate how many people a given customer represents. I consulted the original source of the outage data, the Florida Division of Emergency Management, but to no avail. + +Similarly, outages_daily_normalized.csv has customers without power normalized by the total number of customers. How that total is computed is a little complicated; see the code and its comments for details. \ No newline at end of file diff --git a/readme.txt b/readme.txt index 7df7b64..9942e47 100644 --- a/readme.txt +++ b/readme.txt @@ -4,4 +4,6 @@ For raw tweets, keyword scoring, and labeling results, see the data folder. Be c The documentation is written in LaTeX. The pdf and source code can be found in the documentation folder along with the used media. -For random forest and BERT model notebooks and their results, see the modeling folder. Unlike the previous folder, all the stuff there is up to date and has been carefully maintained. \ No newline at end of file +For random forest and BERT model notebooks and their results, see the modeling folder. Unlike the previous folder, all the stuff there is up to date and has been carefully maintained. + +power_outages is not strictly about classification, but is the next step in the project: using the refined, labeled, power-related Twitter data to analyze outage data. It contains only data from Florida during Hurricane Irma. \ No newline at end of file
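The normalization described in power_outages/readme.txt can be sketched for a single county as follows. This is a minimal sketch, not the code from power_outages.py: the function name normalize_county, the threshold parameter, and the sample readings are made up for illustration, and it uses a pandas groupby instead of the explicit loops (it also skips the rounding the original applies to each provider average).

```python
import numpy as np
import pandas as pd

def normalize_county(rows: pd.DataFrame, threshold: float = 0.9) -> pd.Series:
    """Fraction of customers without power per day for one county.

    `rows` needs the columns date, provider, customers and out; a provider
    may have several readings on the same day.
    """
    # Average the repeated readings of each provider on each day ...
    per_provider = rows.groupby(["date", "provider"])[["customers", "out"]].mean()
    # ... then add the providers up to get one count per day.
    per_day = per_provider.groupby(level="date").sum()
    # Days that cover too few customers would produce outlier ratios,
    # so they are left blank (NaN), as in the 90% rule of the original code.
    enough = per_day["customers"] >= threshold * per_day["customers"].max()
    return (per_day["out"] / per_day["customers"]).where(enough)

# Made-up readings for one hypothetical county with providers A and B.
readings = pd.DataFrame({
    "date": ["2017-09-10", "2017-09-10", "2017-09-10",
             "2017-09-11", "2017-09-12", "2017-09-12"],
    "provider": ["A", "A", "B", "A", "A", "B"],
    "customers": [100, 100, 50, 100, 100, 50],
    "out": [10, 30, 5, 40, 60, 15],
})
ratios = normalize_county(readings)
```

In this made-up example provider B has no reading on 2017-09-11, so that day covers only 100 of the 150 customers seen elsewhere in the month and is left blank rather than reported as a ratio.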