Multilingual comment comparison

In this project, we want to explore how comments on swissinfo.ch differ across languages.

Data Science notebook | Sources

Our pitch at #swihack: youtube.com

Challenge

Swissinfo.ch is the international unit of the Swiss Broadcasting Corporation and provides independent reporting on Switzerland. It reaches about one million users worldwide every month, and these users can comment on the articles.

Our goal is to compare the comments based on their:

  • language
  • topic
  • number
  • publication date
  • length
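
As a minimal sketch of such a comparison, the snippet below groups comments by language and reports their number, mean length, topics, and date range. The record layout (`language`, `topic`, `date`, `text`) and the three sample comments are made up for illustration; they are not taken from the project's data.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical flat comment records; the field names and values are
# assumptions for illustration, not the project's actual schema or data.
comments = [
    {"language": "en", "topic": "politics", "date": date(2020, 1, 14),
     "text": "Interesting article, thanks."},
    {"language": "en", "topic": "economy", "date": date(2020, 2, 3),
     "text": "I disagree with the conclusion."},
    {"language": "de", "topic": "politics", "date": date(2020, 2, 10),
     "text": "Sehr guter Beitrag!"},
]


def summarise_by_language(records):
    """Group comment records by language and compute simple statistics."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["language"]].append(record)

    summary = {}
    for language, items in grouped.items():
        summary[language] = {
            "count": len(items),                                   # number
            "mean_length": mean(len(i["text"]) for i in items),    # length in characters
            "topics": sorted({i["topic"] for i in items}),         # topic
            "first": min(i["date"] for i in items),                # publication date range
            "last": max(i["date"] for i in items),
        }
    return summary


summary = summarise_by_language(comments)
for language, stats in sorted(summary.items()):
    print(language, stats)
```

Length is measured in characters here; counting words instead may be more comparable across languages, although neither measure is neutral.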

Using these criteria, we would first be able to answer the following questions:

  • Do any language communities comment more on specific topics?
  • Are there time trends in multilingual comments on swissinfo.ch?
  • Do any language communities comment more or less positively on specific articles?
  • Are positive/negative comments influenced by translation quality (human translation vs. automatic translation + post-editing)?
  • Who is the most hated/loved author/translator?
  • Are articles in specific languages longer or more detailed?

Then we could conduct a sentiment analysis based on the words used in the articles and, finally, represent words as real-valued vectors in a predefined vector space (word embeddings).
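
One simple way to approach a word-based sentiment analysis is a lexicon count, sketched below. This is only an illustration under strong assumptions: the `POSITIVE` and `NEGATIVE` word sets are toy examples invented here, they cover English only, and the project itself may well use a different method or a trained model.

```python
import re

# Toy lexicon for illustration only (an assumption, not project data).
# A real analysis would need a sentiment lexicon or model per language.
POSITIVE = {"good", "great", "interesting", "thanks"}
NEGATIVE = {"bad", "wrong", "boring", "misleading"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words.

    The score lies in [-1, 1]; empty text scores 0.0.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


print(sentiment_score("Great article, thanks"))
print(sentiment_score("This is wrong and misleading"))
```

A lexicon count ignores negation and context ("not good" scores as positive), so it is best read as a rough baseline.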

IF YOU WANT TO PARTICIPATE:

We've provided the "articles.json" file in the GitHub repository, containing ~450 articles in different languages with language-specific comments. Feel free to add your own analysis. We will gather more articles during the hackathon and upload them periodically.

The JSON file contains a list of articles. Each article holds all of its available language versions, together with all their comments, in the "content" tag.
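
A hedged sketch of how such a file could be traversed follows. Only the top-level list and the "content" tag come from the description above; the inner layout (a mapping from language code to an object with a `comments` list of strings) is an assumption, so check it against the real file before relying on it. The inline `sample` stands in for "articles.json" so that the snippet runs on its own.

```python
import json


def iter_comments(articles):
    """Yield (language, comment) pairs from a list of article dicts.

    Assumed layout (hypothetical): article["content"] maps a language
    code to a dict whose "comments" entry is a list of comment strings.
    """
    for article in articles:
        for language, version in article.get("content", {}).items():
            for comment in version.get("comments", []):
                yield language, comment


def count_comments_per_language(articles):
    """Count how many comments each language version received in total."""
    counts = {}
    for language, _comment in iter_comments(articles):
        counts[language] = counts.get(language, 0) + 1
    return counts


# Inline stand-in for articles.json; the values are invented for illustration.
sample = json.loads("""
[
  {"content": {"en": {"comments": ["Nice read.", "Thanks!"]},
               "de": {"comments": ["Danke."]}}},
  {"content": {"en": {"comments": []},
               "fr": {"comments": ["Merci."]}}}
]
""")

counts = count_comments_per_language(sample)
print(counts)
```

To run this against the real data, replace `sample` with the result of `json.load` on the opened "articles.json" file and adjust the key names to whatever the file actually uses.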

Challenge by Valentina V. Baldassarre, Damian Murezzan, Samuel Pawel and Hubert Zumwald.

These contents were scraped from an external site. Visit the original location to see all the formatting.

swihack-comments

Repository for multilingual comments comparison project at swihack hackathon 2020

22.02.2020 12:15

Hackathon finished

22.02.2020 09:33 ~ hubihack

Worked on documentation

21.02.2020 16:47

Team forming

Damian has joined!

21.02.2020 14:57 ~ Samuel

Worked on documentation

21.02.2020 14:13

Team forming

hubihack has joined!

21.02.2020 14:07 ~ VVBaldassarre

Worked on documentation

21.02.2020 11:27

Team forming

VVBaldassarre has joined!

21.02.2020 11:27

Team forming

Samuel has joined!

21.02.2020 11:27

Project started

Initialized by Samuel 🎉

21.02.2020 08:00

Hackathon started

The contents of this website, unless otherwise stated, are licensed under a Creative Commons Attribution 4.0 International License.