Google Local Data (2021)

Tianyang Zhang, UCSD

Jiacheng Li, UCSD

Description

This dataset contains review information from Google Maps (ratings, text, images, etc.), business metadata (address, geographical info, descriptions, category information, price, open hours, and MISC info), and links (related businesses) up to Sep 2021 in the United States.

Reviews: 666,324,103
Users: 113,643,107
Businesses: 4,963,111

Citation

Please cite the following papers if you use the data in any way:

UCTopic: Unsupervised Contrastive Learning for Phrase Representations and Topic Mining
Jiacheng Li, Jingbo Shang, Julian McAuley
Annual Meeting of the Association for Computational Linguistics (ACL), 2022


Personalized Showcases: Generating Multi-Modal Explanations for Recommendations
An Yan, Zhankui He, Jiacheng Li, Tianyang Zhang, Julian McAuley
The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023

Contact

Jiacheng Li (j9li@eng.ucsd.edu)



Directory

  • Files
  • Code

Files

    Complete review data

    Please only download these (large!) files if you really need them. We recommend using the smaller datasets (i.e. k-core and CSV files) as shown in the next section.

    Alabama reviews (8,967,499 reviews) metadata (74,967 businesses)
    Alaska reviews (1,051,246 reviews) metadata (12,774 businesses)
    Arizona reviews (18,375,050 reviews) metadata (108,579 businesses)
    Arkansas reviews (5,106,056 reviews) metadata (47,246 businesses)
    California reviews (70,529,977 reviews) metadata (515,961 businesses)
    Colorado reviews (15,681,222 reviews) metadata (106,829 businesses)
    Connecticut reviews (5,181,800 reviews) metadata (49,200 businesses)
    Delaware reviews (1,885,948 reviews) metadata (14,706 businesses)
    District of Columbia reviews (1,894,317 reviews) metadata (11,060 businesses)
    Florida reviews (61,803,524 reviews) metadata (378,020 businesses)
    Georgia reviews (24,060,125 reviews) metadata (166,381 businesses)
    Hawaii reviews (3,111,531 reviews) metadata (21,507 businesses)
    Idaho reviews (3,892,636 reviews) metadata (33,214 businesses)
    Illinois reviews (23,096,838 reviews) metadata (179,205 businesses)
    Indiana reviews (12,865,167 reviews) metadata (100,391 businesses)
    Iowa reviews (4,838,887 reviews) metadata (47,794 businesses)
    Kansas reviews (5,546,880 reviews) metadata (46,286 businesses)
    Kentucky reviews (7,654,993 reviews) metadata (63,193 businesses)
    Louisiana reviews (7,536,078 reviews) metadata (63,315 businesses)
    Maine reviews (2,214,773 reviews) metadata (24,853 businesses)
    Maryland reviews (10,728,483 reviews) metadata (78,144 businesses)
    Massachusetts reviews (10,447,007 reviews) metadata (92,520 businesses)
    Michigan reviews (20,776,155 reviews) metadata (158,819 businesses)
    Minnesota reviews (9,520,258 reviews) metadata (80,964 businesses)
    Mississippi reviews (3,861,771 reviews) metadata (37,147 businesses)
    Missouri reviews (13,416,511 reviews) metadata (99,569 businesses)
    Montana reviews (1,933,939 reviews) metadata (21,680 businesses)
    Nebraska reviews (3,286,810 reviews) metadata (30,016 businesses)
    Nevada reviews (8,833,403 reviews) metadata (48,237 businesses)
    New Hampshire reviews (2,648,081 reviews) metadata (24,767 businesses)
    New Jersey reviews (15,720,266 reviews) metadata (127,276 businesses)
    New Mexico reviews (4,705,389 reviews) metadata (34,703 businesses)
    New York reviews (33,459,761 reviews) metadata (272,189 businesses)
    North Carolina reviews (22,299,136 reviews) metadata (166,235 businesses)
    North Dakota reviews (1,109,558 reviews) metadata (11,987 businesses)
    Ohio reviews (23,039,365 reviews) metadata (173,761 businesses)
    Oklahoma reviews (8,482,820 reviews) metadata (68,102 businesses)
    Oregon reviews (11,012,170 reviews) metadata (93,476 businesses)
    Pennsylvania reviews (21,944,802 reviews) metadata (190,816 businesses)
    Rhode Island reviews (1,777,094 reviews) metadata (15,941 businesses)
    South Carolina reviews (11,995,482 reviews) metadata (84,929 businesses)
    South Dakota reviews (1,452,599 reviews) metadata (14,257 businesses)
    Tennessee reviews (15,951,213 reviews) metadata (111,395 businesses)
    Texas reviews (66,435,184 reviews) metadata (447,314 businesses)
    Utah reviews (9,081,167 reviews) metadata (58,797 businesses)
    Vermont reviews (852,203 reviews) metadata (11,291 businesses)
    Virginia reviews (15,957,938 reviews) metadata (119,673 businesses)
    Washington reviews (16,541,734 reviews) metadata (121,304 businesses)
    West Virginia reviews (2,208,199 reviews) metadata (23,541 businesses)
    Wisconsin reviews (10,246,685 reviews) metadata (92,041 businesses)
    Wyoming reviews (1,141,421 reviews) metadata (12,088 businesses)
    Other reviews (162,952 reviews) metadata (1,224 businesses)

    "Small" subsets for experimentation

    If you're using this data for a class project (or similar) please consider using one of these smaller datasets below before requesting the larger files.

    K-cores (i.e., dense subsets): These data have been reduced to extract the k-core, such that each of the remaining users and items has at least k reviews.
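The k-core idea can be illustrated with a minimal sketch. It assumes reviews are available as (user, item) pairs and applies the usual iterative pruning; the function name `k_core` is hypothetical, and this is not necessarily the procedure used to build the downloadable files.

```python
from collections import Counter

def k_core(pairs, k):
    """Return the (user, item) pairs that survive repeated pruning, so that
    every remaining user and every remaining item has at least k pairs."""
    pairs = list(pairs)
    while True:
        user_counts = Counter(u for u, _ in pairs)
        item_counts = Counter(i for _, i in pairs)
        kept = [(u, i) for u, i in pairs
                if user_counts[u] >= k and item_counts[i] >= k]
        # Removing one pair can push another user or item below k,
        # so repeat until a full pass removes nothing.
        if len(kept) == len(pairs):
            return kept
        pairs = kept
```

Pruning has to be repeated because dropping a sparse user also lowers the review counts of the businesses that user reviewed.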

    Ratings only: These datasets include no metadata or review text, only (business, user, rating, timestamp) tuples, so they are suitable for use with MyMediaLite (or similar) packages.
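A ratings-only file could be read along the following lines. This is a sketch under assumptions: the file is taken to be an uncompressed CSV whose columns follow the (business, user, rating, timestamp) order given above, and the presence of a header row is a guess controlled by the hypothetical `has_header` parameter. The names `Rating` and `read_ratings` are likewise illustrative.

```python
import csv
from collections import namedtuple

# Field order follows the tuple layout described in the text.
Rating = namedtuple("Rating", ["business", "user", "rating", "timestamp"])

def read_ratings(path, has_header=True):
    """Yield one Rating per CSV row (assumed column order, see above)."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the assumed header row
        for business, user, rating, timestamp in reader:
            yield Rating(business, user, float(rating), int(timestamp))
```

IDs are kept as strings here, since the user IDs in the sample reviews are too long to be stored safely as floating-point numbers.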

    You can directly download the following smaller per-state datasets.

    Alabama 10-core (5,146,330 reviews) ratings only (8,967,499 ratings)
    Alaska 10-core (521,515 reviews) ratings only (1,051,246 ratings)
    Arizona 10-core (10,764,435 reviews) ratings only (18,375,050 ratings)
    Arkansas 10-core (2,855,468 reviews) ratings only (5,106,056 ratings)
    California 10-core (44,476,890 reviews) ratings only (70,529,977 ratings)
    Colorado 10-core (8,738,271 reviews) ratings only (15,681,222 ratings)
    Connecticut 10-core (2,680,107 reviews) ratings only (5,181,800 ratings)
    Delaware 10-core (905,537 reviews) ratings only (1,885,948 ratings)
    District of Columbia 10-core (564,783 reviews) ratings only (1,894,317 ratings)
    Florida 10-core (35,457,319 reviews) ratings only (61,803,524 ratings)
    Georgia 10-core (13,599,687 reviews) ratings only (24,060,125 ratings)
    Hawaii 10-core (1,504,347 reviews) ratings only (3,111,531 ratings)
    Idaho 10-core (2,085,487 reviews) ratings only (3,892,636 ratings)
    Illinois 10-core (13,237,848 reviews) ratings only (23,096,838 ratings)
    Indiana 10-core (7,638,803 reviews) ratings only (12,865,167 ratings)
    Iowa 10-core (2,677,684 reviews) ratings only (4,838,887 ratings)
    Kansas 10-core (3,080,115 reviews) ratings only (5,546,880 ratings)
    Kentucky 10-core (4,240,662 reviews) ratings only (7,654,993 ratings)
    Louisiana 10-core (3,985,782 reviews) ratings only (7,536,078 ratings)
    Maine 10-core (1,123,881 reviews) ratings only (2,214,773 ratings)
    Maryland 10-core (5,590,890 reviews) ratings only (10,728,483 ratings)
    Massachusetts 10-core (5,624,944 reviews) ratings only (10,447,007 ratings)
    Michigan 10-core (13,212,364 reviews) ratings only (20,776,155 ratings)
    Minnesota 10-core (5,646,319 reviews) ratings only (9,520,258 ratings)
    Mississippi 10-core (1,971,181 reviews) ratings only (3,861,771 ratings)
    Missouri 10-core (7,863,559 reviews) ratings only (13,416,511 ratings)
    Montana 10-core (950,370 reviews) ratings only (1,933,939 ratings)
    Nebraska 10-core (1,817,866 reviews) ratings only (3,286,810 ratings)
    Nevada 10-core (4,170,080 reviews) ratings only (8,833,403 ratings)
    New Hampshire 10-core (1,296,603 reviews) ratings only (2,648,081 ratings)
    New Jersey 10-core (8,227,961 reviews) ratings only (15,720,266 ratings)
    New Mexico 10-core (2,571,363 reviews) ratings only (4,705,389 ratings)
    New York 10-core (18,661,975 reviews) ratings only (33,459,761 ratings)
    North Carolina 10-core (12,905,081 reviews) ratings only (22,299,136 ratings)
    North Dakota 10-core (563,693 reviews) ratings only (1,109,558 ratings)
    Ohio 10-core (14,506,563 reviews) ratings only (23,039,365 ratings)
    Oklahoma 10-core (5,011,462 reviews) ratings only (8,482,820 ratings)
    Oregon 10-core (6,270,332 reviews) ratings only (11,012,170 ratings)
    Pennsylvania 10-core (12,772,358 reviews) ratings only (21,944,802 ratings)
    Rhode Island 10-core (890,006 reviews) ratings only (1,777,094 ratings)
    South Carolina 10-core (6,504,999 reviews) ratings only (11,995,482 ratings)
    South Dakota 10-core (673,048 reviews) ratings only (1,452,599 ratings)
    Tennessee 10-core (8,855,714 reviews) ratings only (15,951,213 ratings)
    Texas 10-core (40,696,824 reviews) ratings only (66,435,184 ratings)
    Utah 10-core (4,933,807 reviews) ratings only (9,081,167 ratings)
    Vermont 10-core (324,725 reviews) ratings only (852,203 ratings)
    Virginia 10-core (8,562,059 reviews) ratings only (15,957,938 ratings)
    Washington 10-core (10,192,020 reviews) ratings only (16,541,734 ratings)
    West Virginia 10-core (1,080,333 reviews) ratings only (2,208,199 ratings)
    Wisconsin 10-core (6,036,482 reviews) ratings only (10,246,685 ratings)
    Wyoming 10-core (427,808 reviews) ratings only (1,141,421 ratings)

    Data format

    The format is one review per line, in JSON. See the examples below for further help reading the data.

    Sample review:

    { 'user_id': '106533466896145407182', 'name': 'Amy VG', 'time': 1568748357166, 'rating': 5, 'text': "I can't say I've ever been excited about a dentist visit before, but there's a first for everything! Loved my experience at Lush today. Every person in the office was friendly and personable- plus the office itself is gorgeous! Great experience, I highly recommend!", 'pics': [ { 'url': ['https://lh5.googleusercontent.com/p/AF1QipMBzN4BJV9YCObcw_ifNzFPm-u38hO3oimOA8Fb=w150-h150-k-no-p'] }, { 'url': ['https://lh5.googleusercontent.com/p/AF1QipNS1PEXEvadfUlhRkRDJ09idMxh3CveZGZYuTo5=w150-h150-k-no-p'] } ], 'resp': { 'time': 1568770503975, 'text': 'We love getting to meet new patients like yourself. Thanks for giving our office a chance to take care of your dental needs and thanks for the nice review!' }, 'gmap_id': '0x87ec2394c2cd9d2d:0xd1119cfbee0da6f3' }
    { 'user_id': '101463350189962023774', 'name': 'Jordan Adams', 'time': 1627750414677, 'rating': 5, 'text': 'Cool place, great people, awesome dentist!', 'pics': [ { 'url': ['https://lh5.googleusercontent.com/p/AF1QipNq2nZC5TH4_M7h5xRAd61hoTgvY1o9lozABguI=w150-h150-k-no-p'] } ], 'resp': { 'time': 1628455067818, 'text': 'Thank you for your five-star review! -Dr. Blake' }, 'gmap_id': '0x87ec2394c2cd9d2d:0xd1119cfbee0da6f3' }

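    where

    user_id - ID of the reviewer
    name - name of the reviewer
    time - time of the review (Unix time; the sample values are in milliseconds)
    rating - rating given to the business
    text - text of the review
    pics - pictures attached to the review, each with a list of URLs
    resp - response of the business to the review, with its own time and text
    gmap_id - ID of the business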
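As a rough illustration of working with one review record, the sketch below converts the `time` field to a UTC datetime and flattens the picture URLs. It assumes that `time` is in milliseconds (the 13-digit sample values suggest this) and that `pics` and `resp` may be missing or null in some records; the helper name `summarize_review` is hypothetical.

```python
from datetime import datetime, timezone

def summarize_review(review):
    """Extract a few convenient values from one review dictionary."""
    # Assumption: 'time' is Unix time in milliseconds.
    posted = datetime.fromtimestamp(review["time"] / 1000, tz=timezone.utc)
    # Assumption: 'pics' and 'resp' can be absent or None.
    pics = review.get("pics") or []
    urls = [url for pic in pics for url in pic["url"]]
    resp = review.get("resp") or {}
    return {
        "posted": posted,
        "urls": urls,
        "has_response": bool(resp),
    }
```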

    Metadata

    Sample metadata:

    { 'name': 'Walgreens Pharmacy', 'address': 'Walgreens Pharmacy, 124 E North St, Kendallville, IN 46755', 'gmap_id': '0x881614ce7c13acbb:0x5c7b18bbf6ec4f7e', 'description': 'Department of the Walgreens chain providing prescription medications & other health-related items.', 'latitude': 41.451859999999996, 'longitude': -85.2666757, 'category': ['Pharmacy'], 'avg_rating': 4.2, 'num_of_reviews': 5, 'price': '$$', 'hours': [['Thursday', '8AM–1:30PM'], ['Friday', '8AM–1:30PM'], ['Saturday', '9AM–1:30PM'], ['Sunday', '10AM–1:30PM'], ['Monday', '8AM–1:30PM'], ['Tuesday', '8AM–1:30PM'], ['Wednesday', '8AM–1:30PM']], 'MISC': { 'Service options': ['Curbside pickup', 'Drive-through', 'In-store pickup', 'In-store shopping'], 'Health & safety': ['Mask required', 'Staff wear masks', 'Staff get temperature checks'], 'Accessibility': ['Wheelchair accessible entrance', 'Wheelchair accessible parking lot'], 'Planning': ['Quick visit'], 'Payments': ['Checks', 'Debit cards'] }, 'state': 'Closes soon ⋅ 1:30PM ⋅ Reopens 2PM', 'relative_results': ['0x881614cd49e4fa33:0x2d507c24ff4f1c74', '0x8816145bf5141c89:0x535c1d605109f94b', '0x881614cda24cc591:0xca426e3a9b826432', '0x88162894d98b91ef:0xd139b34de70d3e03', '0x881615400b5e57f9:0xc56d17dbe420a67f'], 'url': 'https://www.google.com/maps/place//data=!4m2!3m1!1s0x881614ce7c13acbb:0x5c7b18bbf6ec4f7e?authuser=-1&hl=en&gl=us' }

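    where

    name - name of the business
    address - address of the business
    gmap_id - ID of the business
    description - description of the business
    latitude, longitude - geographic coordinates of the business
    category - list of categories of the business
    avg_rating - average rating of the business
    num_of_reviews - number of reviews
    price - price level of the business
    hours - open hours, as [day, hours] pairs
    MISC - miscellaneous information, grouped by topic
    state - current status of the business (e.g. 'Closes soon ⋅ 1:30PM ⋅ Reopens 2PM')
    relative_results - gmap_ids of related businesses
    url - URL of the business on Google Maps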
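Because reviews and metadata share the `gmap_id` key, metadata records can be indexed by it and the `relative_results` links resolved to business names. The following is a small sketch; the helper names `index_by_gmap_id` and `related_names` are hypothetical, and it assumes that some linked IDs may not appear in the loaded metadata (for example, when they belong to another state file).

```python
def index_by_gmap_id(businesses):
    """Build a lookup table from gmap_id to the metadata dictionary."""
    return {b["gmap_id"]: b for b in businesses}

def related_names(business, index):
    """Return names of related businesses that are present in the index."""
    # Assumption: 'relative_results' can be absent or None.
    ids = business.get("relative_results") or []
    # Skip IDs that are not in the loaded metadata instead of failing.
    return [index[g]["name"] for g in ids if g in index]
```

The same index can be used to attach business metadata to each review through the review's `gmap_id`.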

    Code

    Reading the data

    Each record can be treated as a Python dictionary object. A simple script to read any of the above data is as follows:

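    import gzip
    import json

    def parse(path):
        # Yield one dictionary per line of a gzipped JSON file.
        with gzip.open(path, 'rt', encoding='utf-8') as g:
            for line in g:
                yield json.loads(line)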
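Since the complete review files are large, it may help to stream them rather than load them into memory. The sketch below repeats a `parse` generator so that it is self-contained and computes a mean rating per business in one pass. The function name `mean_rating_by_business` is hypothetical, and skipping records without a rating is an assumption about the data, not something stated above.

```python
import gzip
import json
from collections import defaultdict

def parse(path):
    """Yield one dictionary per line of a gzipped JSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as g:
        for line in g:
            yield json.loads(line)

def mean_rating_by_business(path):
    """Stream the reviews once and average the ratings per gmap_id."""
    totals = defaultdict(lambda: [0, 0])  # gmap_id -> [rating sum, count]
    for review in parse(path):
        rating = review.get("rating")
        if rating is None:  # assumption: some records may lack a rating
            continue
        entry = totals[review["gmap_id"]]
        entry[0] += rating
        entry[1] += 1
    return {gmap_id: s / n for gmap_id, (s, n) in totals.items()}
```

Only the running totals are kept in memory, so the cost grows with the number of businesses rather than the number of reviews.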