[KS] KCNA and Rodong Sinmun Article Data
Univ.-Prof. Dr. Ruediger Frank
ruediger.frank at univie.ac.at
Wed Jun 17 02:57:38 EDT 2020
Dear Mr. Fisher and all,
I think this is a great initiative, especially since it is not hidden behind a paywall. It reminds me of the good old www.nk-news.net (aka STALIN, the Statistical Analyzer of Language in North Korea), which includes the priceless random insult generator (it looks like the latter has been used a lot in the last few days). So it is much appreciated. But few things are ever really perfect, so please allow me to make a few suggestions.
For those who have more grants - and the necessary skill - this is what the research community would need:
(1) Make Korean language articles searchable as well.
(2) Regularly update the database (I guess this needs to happen automatically to save HR costs).
(3) Allow users to compare different versions of an article. The folks at KCNA have a habit of editing or deleting already published articles. Keeping permanent offline backups of the database, say four times a year, would thus be helpful.
I have been trying to establish such a tool here at the University of Vienna, but the software guy I hired is still struggling with the Korean searches.
Best wishes,
Rüdiger Frank
On Tuesday, 16 June 2020 at 17:51 you wrote:
> Greetings,
> Thanks to a recent grant, we've been able to assemble databases of articles from the Korean Central News Agency (KCNA) and the Rodong Sinmun. For KCNA the articles run from 1 October 2008 to 27 Feb 2020, just over 85,000 articles. The Rodong Sinmun database is smaller, running from 2 Jan 2018 to 31 Dec 2019, just over 7,100 articles. Both represent all articles available on the respective websites at the time of the scrape/collection earlier this year.
> We added sentiment and topic analysis to the data, put everything into Tableau, and made both databases searchable on the affiliated project's website: https://focusdataproject.com/north-korea/. Note the interesting spike in reporting in Dec 2011. You can run searches using the Search Article Text feature - comparing KCNA sentiment regarding Trump and Moon is quite interesting.
> For those who would like access to the full databases, we set up a Harvard Dataverse: https://dataverse.harvard.edu/dataverse/focusdataproject.
> We are adding similar data for state media and foreign ministry postings from China, Russia, and Iran. The project and affiliated website (https://focusdataproject.com/) are new and just emerging from beta; please let me know of any technical or related issues.
> Happy to answer any questions. A colleague and I will also be presenting (virtually, unfortunately) on the databases and associated methodology at APSA in September.
> Be well,
> Scott
> Scott Fisher, PhD
> Assistant Professor, Professional Security Studies
> New Jersey City University
> sfisher1 at njcu.edu