Attractive Restaurants In Ma On Shan
The second assignment took me through the complete process of data collection, cleaning, and analysis. You can access my website here.
First, I chose the Ma On Shan area, where I live, and scraped data on Ma On Shan restaurants from OpenRice: restaurant name, number of favorites, categories, price range, number of likes, and number of dislikes. This came to about 17 pages of results, or roughly 250 rows. Then I used OpenRefine to clean the data and convert the text into numbers for calculation. I found that some records had not been collected: when a restaurant had the same number of likes and dislikes, the site displayed them at a different size from the one I had selected when setting up the scraping template, so the scraper missed them. Unfortunately, I could not find a way to fill in the missing column automatically, so I entered the uncaptured values by hand. In addition, I split the price range into two columns, the lowest price and the highest price.
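The cleaning itself was done in OpenRefine, but the two conversions described above (text counts to numbers, and one price-range column to a lowest and a highest price) can be sketched in Python. This is only a minimal illustration: the function names `to_int` and `split_price_range` are hypothetical, and the input formats such as "$51-100", "Below $50", and "Above $801" are assumptions about how the scraped text might look, not a record of the actual data.

```python
import re


def to_int(text):
    """Turn a scraped count such as '1,234' into an integer.

    Returns None when the text contains no digits at all.
    """
    digits = re.sub(r"[^\d]", "", text)
    return int(digits) if digits else None


def split_price_range(text):
    """Split a price-range label into (lowest, highest).

    Assumed formats (hypothetical): '$51-100', 'Below $50', 'Above $801'.
    An open-ended upper bound is returned as None.
    """
    # Drop thousands separators, then collect every run of digits.
    numbers = [int(n) for n in re.findall(r"\d+", text.replace(",", ""))]
    if len(numbers) >= 2:
        return numbers[0], numbers[1]
    if len(numbers) == 1:
        if "below" in text.lower():
            return 0, numbers[0]
        # A single number without 'below' is treated as a lower bound only.
        return numbers[0], None
    return None, None
```

Returning None for a missing bound keeps unparseable or open-ended values visible, so they can be checked by hand instead of being silently turned into a number.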
Then I used SQL for the data analysis. I considered a restaurant well received if it had more than ten times as many likes as dislikes. To exclude little-known restaurants, I also required the number of favorites to be greater than 1,000. To analyze the price factor among popular restaurants, I took the mean of each restaurant's lowest and highest price as its average price, and grouped the statistics by restaurant category.
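A query along these lines could express that filter and grouping. The sketch below runs it against an in-memory SQLite database from Python. The table name `restaurants`, the column names, and the sample rows are all made up for illustration, and the factor of ten is one reading of the "likes versus dislikes" rule. It is not the exact query or data used in the assignment.

```python
import sqlite3

# Hypothetical query: popular = likes > 10 x dislikes AND favorites > 1000.
QUERY = """
SELECT category,
       COUNT(*) AS restaurant_count,
       AVG((low_price + high_price) / 2.0) AS avg_price
FROM restaurants
WHERE likes > 10 * dislikes
  AND favorites > 1000
GROUP BY category
ORDER BY category
"""


def popular_price_by_category(rows):
    """Load rows into an in-memory table and run the grouping query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE restaurants ("
            "name TEXT, category TEXT, favorites INTEGER, "
            "likes INTEGER, dislikes INTEGER, "
            "low_price INTEGER, high_price INTEGER)"
        )
        conn.executemany(
            "INSERT INTO restaurants VALUES (?, ?, ?, ?, ?, ?, ?)", rows
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


# Invented sample rows, for illustration only:
# (name, category, favorites, likes, dislikes, low_price, high_price)
SAMPLE_ROWS = [
    ("Restaurant A", "Japanese", 1500, 100, 5, 51, 100),
    ("Restaurant B", "Japanese", 2000, 300, 10, 101, 200),
    ("Restaurant C", "Hong Kong Style", 1200, 80, 20, 0, 50),
    ("Restaurant D", "Hong Kong Style", 3000, 500, 20, 51, 100),
    ("Restaurant E", "Western", 500, 90, 1, 101, 200),
]

results = popular_price_by_category(SAMPLE_ROWS)
```

Dividing by 2.0 rather than 2 forces floating-point division in SQLite, so the midpoint of a range such as 51 to 100 is not truncated to a whole number.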