CS615A -- Aspects of System Administration - Scripting Exercise

Text Processing

The Wikimedia project provides analytics data files for the page views of its projects, including the popular Wikipedia. This data is available from https://dumps.wikimedia.org/, with page views at https://dumps.wikimedia.org/other/pageviews/.

The format of these files is described in more detail here, but the files we are interested in are made up of these four space-separated fields:

domain_code page_title count_views total_response_size

Using this format, fetch a file for a specific hour (e.g., this file for the hour starting at 18:00 UTC on March 29th, 2020) and answer the following questions:
Can you generalize these questions to make your answer more flexible for other areas of interest?
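As a starting point for generalizing, the four-field record format described above can be parsed once and then queried with parameters rather than hard-coded values. The following is a minimal sketch, not a reference solution: the names PageView, parse_line, and top_pages are hypothetical, the SAMPLE records are made up for illustration, and skipping lines that do not have exactly four valid fields is an assumption about how to treat malformed input.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Optional, Tuple


class PageView(NamedTuple):
    """One record: domain_code page_title count_views total_response_size."""
    domain_code: str
    page_title: str
    count_views: int
    total_response_size: int


def parse_line(line: str) -> Optional[PageView]:
    """Parse one record; return None if it lacks four valid fields."""
    fields = line.split()
    if len(fields) != 4:
        return None
    domain_code, page_title, views, size = fields
    try:
        return PageView(domain_code, page_title, int(views), int(size))
    except ValueError:
        # Non-numeric count or size: treat the line as malformed (assumption).
        return None


def top_pages(lines: Iterable[str], domain_code: str, n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most-viewed page titles for one domain code."""
    totals: Counter = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None and record.domain_code == domain_code:
            totals[record.page_title] += record.count_views
    return totals.most_common(n)


# Made-up records for illustration only; not taken from a real dump file.
SAMPLE = [
    "en Main_Page 120 0",
    "en Unix 7 0",
    "de Unix 3 0",
    "en Unix 5 0",
    "malformed line",
]

if __name__ == "__main__":
    for title, views in top_pages(SAMPLE, "en", n=2):
        print(title, views)
```

Because top_pages accepts any iterable of lines, the same function could be pointed at a downloaded hourly file, for example by passing a file object opened in text mode with gzip.open if the file is gzip-compressed. Taking the domain code and the result count as parameters is one way to make the answers reusable for other areas of interest.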