Analysis of restaurants in Yuen Long District
Hello, and welcome to my analysis report. The study area is Yuen Long District. First, I used ParseHub to collect information on 1,040 restaurants in Yuen Long District from Openrice. I then preprocessed the data in Python; after removing duplicates and null values, 840 valid records remained. I also extracted information from the existing fields into new columns and exported the result as a new dataset. Finally, I analyzed the data with SQL and visualized it with Python to produce the final report.
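The preprocessing step described above can be sketched roughly as follows. This is only an illustration with pandas, not the exact script used for the report: the sample rows, the column names, and the derived `price_upper` column are all hypothetical stand-ins for the scraped Openrice fields.

```python
import pandas as pd

# Hypothetical sample standing in for the ParseHub export;
# the column names are assumptions, not the report's actual schema.
raw = pd.DataFrame(
    {
        "name": ["Cafe A", "Cafe A", "Noodle B", None, "Sushi C"],
        "type": ["Western", "Western", "Noodles", "Dessert", "Japanese"],
        "rating": [4.2, 4.2, 3.8, 4.0, None],
        "reviews": [120, 120, 45, 10, 60],
        "price_range": ["$51-100", "$51-100", "Below $50", "$51-100", "$101-200"],
    }
)


def preprocess(df):
    """Drop duplicate rows and rows with nulls, then derive a new column."""
    cleaned = df.drop_duplicates().dropna().reset_index(drop=True)
    # Illustrative derived column: the last number in the price text,
    # read as the upper bound of per capita spending.
    cleaned["price_upper"] = (
        cleaned["price_range"].str.extract(r"(\d+)\D*$", expand=False).astype(int)
    )
    return cleaned


cleaned = preprocess(raw)
print(cleaned)
```

In a real run, the cleaned frame would then be exported (for example with `to_csv`) so that it can be loaded into a SQL database for analysis.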
The report mainly studies the following questions:
1. Which restaurants in Yuen Long District are the highest-rated and the most-reviewed?
2. What are the top 10 most popular types of restaurants in Yuen Long District?
3. How are the ratings, number of reviews, and per capita spending of restaurants in Yuen Long District distributed?
4. What is the relationship between ratings and the number of reviews for restaurants in Yuen Long District?
5. What is the relationship between ratings and per capita spending for restaurants in Yuen Long District?
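As a rough idea of how questions like these can be put to SQL, here is a minimal sketch using Python's built-in `sqlite3` module. The table name, columns, and rows are hypothetical, and "popular" is assumed here to mean the number of restaurants of each type; the report's own queries and definitions may differ.

```python
import sqlite3

# Hypothetical rows standing in for the cleaned dataset:
# (name, type, rating, reviews, price_upper)
rows = [
    ("Cafe A", "Western", 4.2, 120, 100),
    ("Noodle B", "Noodles", 3.8, 45, 50),
    ("Noodle D", "Noodles", 4.5, 300, 50),
    ("Sushi C", "Japanese", 4.0, 60, 200),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE restaurants "
    "(name TEXT, type TEXT, rating REAL, reviews INTEGER, price_upper INTEGER)"
)
conn.executemany("INSERT INTO restaurants VALUES (?, ?, ?, ?, ?)", rows)

# Question 2 (assumed definition): types ranked by how many restaurants they have.
top_types = conn.execute(
    """
    SELECT type, COUNT(*) AS n
    FROM restaurants
    GROUP BY type
    ORDER BY n DESC, type ASC
    LIMIT 10
    """
).fetchall()

# Question 1: highest-rated restaurants, with review count as a tie-breaker.
top_rated = conn.execute(
    """
    SELECT name, rating, reviews
    FROM restaurants
    ORDER BY rating DESC, reviews DESC
    LIMIT 3
    """
).fetchall()

print(top_types)
print(top_rated)
conn.close()
```

The same pattern of `GROUP BY`, `ORDER BY`, and `LIMIT` covers most ranking questions; the distribution and relationship questions are usually easier to answer with plots of the query results.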
For more detail, you can visit my page.
During the analysis, I found Python very powerful for data analysis and visualization thanks to packages such as pandas and matplotlib. SQL, by contrast, has a fairly small syntax, but its clauses can be combined in many ways to answer a wide range of questions.
Because I am not yet fluent in Python, my data preprocessing and visualization were a bit clumsy, and I need to study these areas further. Even so, the whole process of scraping data, organizing it, and analyzing it to reach a conclusion has been very beneficial to me. Without this assignment, I probably would not have had this experience.