My second assignment: data analysis
I collected data on Ma On Shan Plaza and MOSTown using ParseHub. These are two large shopping malls connected by an MTR station in Ma On Shan, Shatin. I often eat at these two malls, so I chose them for Assignment 2.
I collected nine pages of data from OpenRice, about 120 records in total. I made a big mistake the first time I used ParseHub: I collected each page's data separately, nine times, and the resulting layout in OpenRefine looked weird. So I watched the tutorial and checked the lesson's PDF, then collected the data again, and this time I got the correct format. I deleted about 20 records because they had blank cells, and I sorted restaurants_likes from largest to smallest for later work. The restaurants_comments column contained Chinese text alongside the numbers, so I split the Chinese from the numbers and deleted the Chinese column.
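The same three cleaning steps (drop rows with blanks, keep only the number in restaurants_comments, sort by restaurants_likes) can also be sketched in plain Python. This is only an illustration, not the OpenRefine workflow itself: the restaurants_name column and the sample rows are made up, and the Chinese text next to the numbers is a guess at what such a cell might look like.

```python
import re


def clean_rows(rows):
    """Drop rows with blank cells, keep the number from restaurants_comments,
    and sort by restaurants_likes from largest to smallest."""
    cleaned = []
    for row in rows:
        # Skip any row that has an empty (or missing) cell.
        if any(value is None or str(value).strip() == "" for value in row.values()):
            continue
        # Keep only the first run of digits; the Chinese text is discarded.
        match = re.search(r"\d+", str(row["restaurants_comments"]))
        if match is None:
            continue
        new_row = dict(row)
        new_row["restaurants_comments"] = int(match.group())
        new_row["restaurants_likes"] = int(row["restaurants_likes"])
        cleaned.append(new_row)
    cleaned.sort(key=lambda r: r["restaurants_likes"], reverse=True)
    return cleaned


# Made-up sample rows, only to show the shape of the data.
sample = [
    {"restaurants_name": "A", "restaurants_likes": "12", "restaurants_comments": "130 食評"},
    {"restaurants_name": "B", "restaurants_likes": "", "restaurants_comments": "20 食評"},
    {"restaurants_name": "C", "restaurants_likes": "45", "restaurants_comments": "88 食評"},
]
cleaned_sample = clean_rows(sample)
for row in cleaned_sample:
    print(row)
```

Row B is dropped because its likes cell is blank, and the remaining rows come out with the most-liked restaurant first.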
Next, I loaded the new CSV file into a database with SQLite so that I could use Python to analyze the data in Jupyter Notebook. This step was not so easy, and it was the one I spent the most time on. My first CSV file had a special character in its file name, so I had to export a new one, but only after spending a lot of time working out why the first file was wrong.
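One possible way to get a CSV into SQLite from Python uses only the standard library. This is a rough sketch under assumptions: the file name restaurants_clean.csv, the table name restaurants, the helper load_csv_to_sqlite, and the demo rows are all hypothetical, and the real import may have been done with a different tool. A plain file name with only letters, digits, and underscores avoids the kind of naming problem described above.

```python
import csv
import os
import sqlite3
import tempfile


def load_csv_to_sqlite(csv_path, conn, table="restaurants"):
    """Create `table` from the CSV header row and insert every data row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
        # Values are stored as text; cast them in queries when comparing numbers.
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


# Write a tiny made-up CSV so the sketch runs on its own.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "restaurants_clean.csv")
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["restaurants_name", "restaurants_likes", "restaurants_comments"])
    writer.writerow(["A", 12, 130])
    writer.writerow(["C", 45, 88])

conn = sqlite3.connect(":memory:")  # use a file path instead to keep the database
load_csv_to_sqlite(csv_path, conn)
row_count = conn.execute("SELECT COUNT(*) FROM restaurants").fetchone()[0]
print(row_count)
```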
The coding was not too hard, since I could search Google for whatever I needed. I analyzed the restaurants with more than 100 comments, the top 10 most-liked restaurants, and the number of restaurants in each price range.
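The three questions map onto three short SQL queries. The sketch below is only an illustration with made-up rows in an in-memory database: the restaurants table and the restaurants_name and restaurants_price columns are assumed names, and the price labels are invented, so the real notebook may look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE restaurants ("
    "restaurants_name TEXT, restaurants_price TEXT, "
    "restaurants_likes INTEGER, restaurants_comments INTEGER)"
)
# Made-up rows, only to make the queries runnable.
sample_rows = [
    ("A", "$51-100", 12, 130),
    ("B", "$101-200", 45, 88),
    ("C", "$51-100", 30, 240),
]
conn.executemany("INSERT INTO restaurants VALUES (?, ?, ?, ?)", sample_rows)

# 1. Restaurants with more than 100 comments.
busy = conn.execute(
    "SELECT restaurants_name, restaurants_comments FROM restaurants "
    "WHERE restaurants_comments > 100 ORDER BY restaurants_comments DESC"
).fetchall()

# 2. Top 10 restaurants by likes.
top_liked = conn.execute(
    "SELECT restaurants_name, restaurants_likes FROM restaurants "
    "ORDER BY restaurants_likes DESC LIMIT 10"
).fetchall()

# 3. Number of restaurants in each price range.
price_counts = conn.execute(
    "SELECT restaurants_price, COUNT(*) FROM restaurants "
    "GROUP BY restaurants_price"
).fetchall()

print(busy)
print(top_liked)
print(price_counts)
```

Each result is a list of tuples, which can be passed on to a table or chart in the notebook.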
If you are interested in my analysis, you can click the link here.