goodbooks-10k
Creators
Description
The dataset contains six million ratings for the ten thousand most popular books (those with the most ratings). It also includes books marked "to read" by users, book metadata (author, year, etc.), and tags/shelves/genres.
ratings contains the ratings, sorted by time. Ratings range from one to five. Both book IDs and user IDs are contiguous: book IDs run from 1 to 10000 and user IDs from 1 to 53424.
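The properties stated for ratings (values from one to five, contiguous IDs starting at 1) can be checked with a short script. The following is a minimal sketch, not part of the dataset's own tooling: the column names `user_id`, `book_id`, and `rating` are assumptions, the helper `check_ratings` is hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io


def check_ratings(rows):
    """Validate rating values and report whether the IDs are contiguous from 1."""
    users, books = set(), set()
    for row in rows:
        rating = int(row["rating"])
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range: {rating}")
        users.add(int(row["user_id"]))
        books.add(int(row["book_id"]))

    def contiguous(ids):
        # Contiguous means every integer from 1 up to the maximum ID is present.
        return bool(ids) and ids == set(range(1, max(ids) + 1))

    return {
        "users_contiguous": contiguous(users),
        "books_contiguous": contiguous(books),
        "n_users": len(users),
        "n_books": len(books),
    }


# Made-up sample rows; the column names are assumed, not confirmed by this page.
SAMPLE = """user_id,book_id,rating
1,1,5
2,1,4
2,2,3
3,3,1
"""

report = check_ratings(csv.DictReader(io.StringIO(SAMPLE)))
print(report)
```

To run the same check on the real file, `csv.DictReader` could be pointed at an opened ratings file instead of the in-memory sample, assuming the header matches.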
to_read provides IDs of the books marked "to read" by each user, as user_id,book_id pairs, sorted by time. There are close to a million pairs.
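Because the pairs are sorted by time, grouping them by user yields each user's to-read list in the order the books were marked. A minimal sketch, assuming a header of `user_id,book_id`; the helper `to_read_lists` is hypothetical and the sample pairs are made up:

```python
import csv
import io
from collections import defaultdict


def to_read_lists(rows):
    """Group (user_id, book_id) pairs into one ordered list of book IDs per user."""
    lists = defaultdict(list)
    for row in rows:
        # Appending keeps the input order, so time order is preserved per user.
        lists[int(row["user_id"])].append(int(row["book_id"]))
    return dict(lists)


# Made-up sample pairs for illustration only.
SAMPLE_PAIRS = """user_id,book_id
1,112
2,235
1,533
"""

per_user = to_read_lists(csv.DictReader(io.StringIO(SAMPLE_PAIRS)))
print(per_user)
```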
books has metadata for each book (goodreads IDs, authors, title, average rating, etc.). The metadata have been extracted from goodreads XML files.
book_tags contains tags/shelves/genres assigned by users to books. Tags in this file are represented by their IDs. They are sorted by goodreads_book_id ascending and count descending.
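The sort order described for book_tags means the most frequent tags of each book come first within that book's run of rows, so the top tags can be read off in a single pass without re-sorting. A minimal sketch, assuming the columns are named `goodreads_book_id`, `tag_id`, and `count`; the helper `top_tags` is hypothetical and the sample values are made up:

```python
import csv
import io
from itertools import groupby, islice


def top_tags(rows, k=2):
    """Return the first k tag IDs per book, relying on the file's sort order."""
    result = {}
    # groupby only merges adjacent rows, which is enough because the rows
    # are already sorted by goodreads_book_id.
    for book_id, group in groupby(rows, key=lambda r: r["goodreads_book_id"]):
        # Within a book, rows are sorted by count descending, so the first
        # k rows are the k most frequently assigned tags.
        result[int(book_id)] = [int(r["tag_id"]) for r in islice(group, k)]
    return result


# Made-up sample rows; the column names are assumed, not confirmed by this page.
SAMPLE_TAGS = """goodreads_book_id,tag_id,count
1,30574,900
1,11305,400
1,11557,300
2,30574,250
2,8717,50
"""

tags_by_book = top_tags(csv.DictReader(io.StringIO(SAMPLE_TAGS)), k=2)
print(tags_by_book)
```

Since the tags appear only as IDs in this file, turning them into readable names would need a separate lookup that this page does not describe.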
The data set is 68.8 MB in size.
Files
book_tags.txt__100lines.txt
Total size: 36.4 kB
| MD5 checksum | Size |
|---|---|
| md5:bdb375d91ace3f2f5dfee7ff2e2191fb | 1.2 kB |
| md5:1c4787b7314e0793b84f31f21de73349 | 32.1 kB |
| md5:25960be4d06a8a2a4cfe77db4d2258ae | 815 Bytes |
| md5:72a2e84bb66eccf723ba6bdda6b949f6 | 1.4 kB |
| md5:09b518bd43125954e7a5801060da357d | 838 Bytes |
Details
| Resource type | Open dataset |
| Title | goodbooks-10k |
| Creators | |
| Research Fields | Business Administration, Economics, Psychology, Sociology, Political Science, Economic & Social History, Communication Sciences, Educational Research, Other |
| Size | 68.8 MB |
| Formats | Comma-separated values (CSV) (.csv) |
| License(s) | Creative Commons Attribution Share Alike 4.0 International |
| External Resource | https://github.com/zygmuntz/goodbooks-10k |