Public datasets
Even if you don’t have your own dataset yet, you can get started right away. MOSTLY AI hosts a repository of public datasets that you can use to explore the platform. Several datasets are provided out of the box, and users can also create their own datasets and make them public, giving other MOSTLY AI users access to them.
US Census Income dataset
This dataset is taken from the Adult Dataset from UC Irvine’s Machine Learning Repository.
It is an extraction from the 1994 US Census database and contains 48,842 records and 13 columns of data, with a mix of data types.
Click here to download the .csv.gz file.
age workclass fnlwgt education marital-status occupation ... race sex hours-per-week native-country capital income
0 39 State-gov 77516 Bachelors Never-married Adm-clerical ... White Male 40 United-States 2174 <=50K
1 50 Self-emp-not-inc 83311 Bachelors Married-civ-spouse Exec-managerial ... White Male 13 United-States 0 <=50K
2 38 Private 215646 HS-grad Divorced Handlers-cleaners ... White Male 40 United-States 0 <=50K
3 53 Private 234721 11th Married-civ-spouse Handlers-cleaners ... Black Male 40 United-States 0 <=50K
4 28 Private 338409 Bachelors Married-civ-spouse Prof-specialty ... Black Female 40 Cuba 0 <=50K
... ... ... ... ... ... ... ... ... ... ... ... ... ...
48837 39 Private 215419 Bachelors Divorced Prof-specialty ... White Female 36 United-States 0 <=50K
48838 64 ? 321403 HS-grad Widowed ? ... Black Male 40 United-States 0 <=50K
48839 38 Private 374983 Bachelors Married-civ-spouse Prof-specialty ... White Male 50 United-States 0 <=50K
48840 44 Private 83891 Bachelors Divorced Adm-clerical ... Asian-Pac-Islander Male 40 United-States 5455 <=50K
48841 35 Self-emp-inc 182148 Bachelors Married-civ-spouse Exec-managerial ... White Male 60 United-States 0 >50K
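A minimal sketch of reading a gzip-compressed CSV such as this one with pandas. The in-memory sample below is a hypothetical stand-in built from a few rows and columns of the preview above, and treating `?` as a missing-value marker is an assumption based on row 48838, not something the platform documentation states.

```python
import gzip
import io

import pandas as pd

# Hypothetical stand-in for the downloaded file: a few rows and columns
# copied from the preview above, gzip-compressed in memory.
sample_csv = (
    "age,workclass,education,hours-per-week,income\n"
    "39,State-gov,Bachelors,40,<=50K\n"
    "64,?,HS-grad,40,<=50K\n"
    "35,Self-emp-inc,Bachelors,60,>50K\n"
)
compressed = io.BytesIO(gzip.compress(sample_csv.encode("utf-8")))

# For the real download, pass the local file path instead of `compressed`;
# pandas then infers gzip compression from the .csv.gz extension.
# Assumption: "?" marks an unknown value and should be read as missing.
census = pd.read_csv(compressed, compression="gzip", na_values="?")

# Count missing values per column to see where "?" occurred.
missing_per_column = census.isna().sum()
print(missing_per_column)
```

If the real file pads values with spaces, the `?` marker may need to be matched after stripping whitespace.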
Baseball dataset
This dataset is taken from the Sean Lahman Baseball Database.
Click here to download the .zip file. It includes the players.csv and seasons.csv files.
id country birthDate deathDate nameFirst nameLast weight height bats throws
0 00020a493f3b P.R. 1993-02-10 NaN Jorge Lopez 195.0 75.0 R R
1 000492168bd5 USA 1945-10-12 1970-12-14 Herman Hill 190.0 74.0 L R
2 0007b3925736 USA 1890-12-24 1956-09-12 Tod Sloan 175.0 72.0 L R
3 000f9b5832e6 USA 1886-03-06 1948-05-26 Bill Sweeney 175.0 71.0 R R
4 00148e917757 USA 1959-09-10 NaN Bruce Robbins 190.0 73.0 L L
... ... ... ... ... ... ... ... ... ... ...
18995 fff2e8e0ccff P.R. 1953-04-02 NaN Hector Cruz 170.0 71.0 R R
18996 fff3d8297c46 USA 1917-05-19 1993-06-07 Skippy Roberge 185.0 71.0 R R
18997 fff913eb4437 Panama 1976-06-20 NaN Carlos Lee 270.0 74.0 R R
18998 fffa11996763 Cuba 1965-10-11 NaN Orlando Hernandez 210.0 74.0 R R
18999 fffa80049d40 P.R. 1990-02-18 NaN Joe Colon 180.0 72.0 R R
players_id year team league G AB R H HR RBI SB CS BB SO
0 00020a493f3b 2015 MIL NL 2 2 0 0 0 0.0 0.0 0.0 0 2.0
1 00020a493f3b 2017 MIL NL 1 0 0 0 0 0.0 0.0 0.0 0 0.0
2 00020a493f3b 2018 MIL NL 10 2 1 1 0 2.0 0.0 0.0 0 1.0
3 00020a493f3b 2018 KCA AL 7 0 0 0 0 0.0 0.0 0.0 0 0.0
4 000492168bd5 1969 MIN AL 16 2 4 0 0 0.0 1.0 2.0 0 1.0
... ... ... ... ... .. .. .. .. .. ... ... ... .. ...
103573 fffa11996763 2005 CHA AL 24 3 0 1 0 0.0 0.0 0.0 0 1.0
103574 fffa11996763 2006 ARI NL 9 11 0 3 0 0.0 0.0 0.0 1 0.0
103575 fffa11996763 2006 NYN NL 20 35 4 5 0 2.0 1.0 0.0 0 10.0
103576 fffa11996763 2007 NYN NL 28 48 1 8 0 3.0 2.0 0.0 0 18.0
103577 fffa80049d40 2016 CLE AL 11 0 0 0 0 0.0 0.0 0.0 0 0.0
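The two previews above suggest that `players_id` in seasons.csv refers to `id` in players.csv. That relationship is inferred from the matching values, not confirmed by the source. Under that assumption, the following sketch joins a handful of the previewed rows and totals games played (`G`) per player.

```python
import pandas as pd

# A few rows copied from the previews above (columns trimmed for brevity).
players = pd.DataFrame({
    "id": ["00020a493f3b", "000492168bd5"],
    "nameFirst": ["Jorge", "Herman"],
    "nameLast": ["Lopez", "Hill"],
})
seasons = pd.DataFrame({
    "players_id": ["00020a493f3b"] * 4 + ["000492168bd5"],
    "year": [2015, 2017, 2018, 2018, 1969],
    "team": ["MIL", "MIL", "MIL", "KCA", "MIN"],
    "G": [2, 1, 10, 7, 16],
})

# Assumption: seasons.players_id is a foreign key to players.id.
player_seasons = seasons.merge(
    players, left_on="players_id", right_on="id", how="left"
)

# Total games per player across all of their season rows.
games_per_player = player_seasons.groupby(["nameFirst", "nameLast"])["G"].sum()
print(games_per_player)
```

A player can have several rows for the same year, one per team, as the 2018 rows for MIL and KCA show.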
CDNOW dataset
This dataset contains a CRM table and the entire purchase history up to the end of June 1998 of 23,570 customers who made their first-ever purchase at CDNOW in the first quarter of 1997.
Click here to download the .csv.gz file.
first_name last_name state gender birthdate
0 Bobby Thompson Oregon M 1972-07-19
1 John Wood New Jersey M 1962-02-08
2 Michael Griffith Minnesota M 1981-03-22
3 Eric Walker Michigan M 1942-10-07
4 Austin Levine New Jersey M 1952-05-23
... ... ... ... ... ...
23565 Luis Braun Florida M 1954-05-09
23566 Nicholas Aguilar Indiana M 1950-10-01
23567 Alison Larson New Jersey F 1954-06-08
23568 Joseph Cook Utah M 1935-06-11
23569 Debbie Zamora Illinois F 1977-06-02
Click here to download the .zip file. It includes the customers.csv and purchases.csv tables.
id zone state gender age_category age
0 1 Pacific Oregon M young 26
1 2 Eastern New Jersey M medium 36
2 3 Central Minnesota M young 17
3 4 Eastern Michigan M medium 56
4 5 Eastern New Jersey M medium 46
... ... ... ... ... ... ...
23565 23566 Eastern Florida M medium 44
23566 23567 Eastern Indiana M medium 48
23567 23568 Eastern New Jersey F medium 44
23568 23569 Mountain Utah M old 63
23569 23570 Central Illinois F young 21
users_id date cds amt
0 1 1997-01-01 1 11.77
1 2 1997-01-12 1 12.00
2 2 1997-01-12 5 77.00
3 3 1997-01-02 2 20.76
4 3 1997-03-30 2 20.76
... ... ... ... ...
69654 23568 1997-04-05 4 83.74
69655 23568 1997-04-22 1 14.99
69656 23569 1997-03-25 2 25.74
69657 23570 1997-03-25 3 51.12
69658 23570 1997-03-26 2 42.96
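As an illustration of how the customers and purchases tables fit together, the sketch below summarizes purchases per customer using a few of the previewed rows. It assumes that `users_id` in purchases.csv refers to `id` in customers.csv, which the matching values suggest but the source does not state.

```python
import pandas as pd

# A few rows copied from the previews above (columns trimmed for brevity).
customers = pd.DataFrame({
    "id": [1, 2, 3],
    "state": ["Oregon", "New Jersey", "Minnesota"],
})
purchases = pd.DataFrame({
    "users_id": [1, 2, 2, 3, 3],
    "date": pd.to_datetime(
        ["1997-01-01", "1997-01-12", "1997-01-12", "1997-01-02", "1997-03-30"]
    ),
    "cds": [1, 1, 5, 2, 2],
    "amt": [11.77, 12.00, 77.00, 20.76, 20.76],
})

# One summary row per customer: purchase count, CDs bought, amount spent,
# and the date of the most recent purchase.
summary = purchases.groupby("users_id").agg(
    n_purchases=("date", "count"),
    total_cds=("cds", "sum"),
    total_amt=("amt", "sum"),
    last_purchase=("date", "max"),
)

# Assumption: purchases.users_id is a foreign key to customers.id.
customer_summary = customers.merge(
    summary, left_on="id", right_index=True, how="left"
)
print(customer_summary)
```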
Netflix Prize dataset
This sequence dataset is an excerpt from the original Netflix Prize dataset. It contains 500,000+ ratings from 10,000 users.
Click here to download the .zip file.
id
0 495
1 840
2 1374
3 1522
4 1619
... ...
9995 2648416
9996 2648568
9997 2648678
9998 2648907
9999 2649207
users_id date movie rating
0 495 2003-10-08 A Mighty Wind 4
1 495 2003-10-24 On the Beach 4
2 495 2003-11-17 Seven Samurai 5
3 495 2003-11-26 Midnight Cowboy 4
4 495 2003-12-04 Yojimbo 5
... ... ... ... ...
501283 2649207 2005-02-08 Napoleon Dynamite 5
501284 2649207 2005-02-08 The Importance of Being Earnest 4
501285 2649207 2005-06-08 Friday Night Lights 2
501286 2649207 2005-06-16 The Hitchhiker's Guide to the Galaxy 1
501287 2649207 2005-08-14 Ray 3
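Because this is a sequence dataset, each user's ratings form an ordered series of events. The sketch below, built from the ten previewed rating rows, orders the events by date within each user and computes the sequence length per user. Reading `users_id` as a reference to `id` in the users table is an assumption drawn from the matching values.

```python
import pandas as pd

# The ten rating rows shown in the preview above.
ratings = pd.DataFrame({
    "users_id": [495] * 5 + [2649207] * 5,
    "date": pd.to_datetime([
        "2003-10-08", "2003-10-24", "2003-11-17", "2003-11-26", "2003-12-04",
        "2005-02-08", "2005-02-08", "2005-06-08", "2005-06-16", "2005-08-14",
    ]),
    "movie": [
        "A Mighty Wind", "On the Beach", "Seven Samurai", "Midnight Cowboy",
        "Yojimbo", "Napoleon Dynamite", "The Importance of Being Earnest",
        "Friday Night Lights", "The Hitchhiker's Guide to the Galaxy", "Ray",
    ],
    "rating": [4, 4, 5, 4, 5, 5, 4, 2, 1, 3],
})

# Order events chronologically within each user; a stable sort keeps the
# original order of ratings that share the same date.
ratings = ratings.sort_values(["users_id", "date"], kind="stable")

# Zero-based position of each rating within its user's sequence.
ratings["position"] = ratings.groupby("users_id").cumcount()

# Number of ratings (sequence length) and mean rating per user.
sequence_lengths = ratings.groupby("users_id").size()
mean_ratings = ratings.groupby("users_id")["rating"].mean()
print(sequence_lengths)
```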
Meteostat weather dataset
This dataset provides historical weather observations and 30-year climate normals from stations all over the world, including daily temperature, precipitation, wind, and sunshine data from 1948 to 2025.
Click here to use it on the MOSTLY AI platform.
Use the stations table to identify which station you’d like to use:
station_name country region wmo_id icao_id lat lon elev_m hourly_from hourly_to daily_from daily_to monthly_from monthly_to
0 Holden Agdm CA AB 71227 CXHD 53.1900 -112.2500 688.0 2020-01-01 2024-12-07 2002-11-01 2024-03-13 2003-01-01 2022-01-01
1 Athabasca 1 CA AB <NA> <NA> 54.7200 -113.2900 515.0 NaT NaT 2000-01-01 2022-07-12 2000-01-01 2010-01-01
2 Jan Mayen NO <NA> 01001 ENJA 70.9333 -8.6667 10.0 1931-01-01 2025-03-20 1921-12-31 2025-07-21 1922-01-01 2022-01-01
3 Grahuken NO SJ 01002 <NA> 79.7833 14.4667 0.0 1986-11-09 2025-03-20 2010-10-07 2020-08-17 NaT NaT
4 Hornsund NO <NA> 01003 <NA> 77.0000 15.5000 10.0 1985-06-01 2025-03-20 2009-11-26 2020-08-31 2016-01-01 2017-01-01
... ... ... ... ... ... ... ... ...
4995 Jharsuguda IN OR 42886 VEJH 21.9167 84.0833 228.0 1957-01-02 2025-06-27 1949-08-11 2025-07-23 1949-01-01 2021-01-01
4996 Keongjhargarh IN OR 42891 <NA> 21.6167 85.5167 461.0 2001-09-17 2025-06-15 NaT NaT NaT NaT
4997 Baripada IN OR 42894 <NA> 21.9333 86.7667 53.0 2009-02-05 2025-06-11 NaT NaT NaT NaT
4998 Balasore IN OR 42895 <NA> 21.5167 86.9333 18.0 1944-01-01 2025-06-15 1901-01-01 2025-07-18 1901-01-01 2021-01-01
4999 Contai IN WB 42900 <NA> 21.7833 87.7500 10.0 2002-03-14 2025-06-15 NaT NaT NaT NaT
station country date avg_temp_c min_temp_c max_temp_c precip_mm pressure_hpa
0 London Heathrow GB 2022-07-30 21.2 15.9 27.0 0.0 1018.3
1 London Heathrow GB 2022-07-31 22.4 19.8 27.6 0.0 1016.2
2 London Heathrow GB 2022-08-01 21.9 18.2 27.8 0.0 1018.0
3 London Heathrow GB 2022-08-02 21.7 18.3 26.9 0.0 1015.3
4 London Heathrow GB 2022-08-03 21.5 17.8 27.2 0.0 1013.5
... ... ... ... ... ... ... ... ...
1090 London Heathrow GB 2025-07-24 18.2 15.4 20.5 0.2 1017.5
1091 London Heathrow GB 2025-07-25 21.6 16.1 26.3 0.0 1018.9
1092 London Heathrow GB 2025-07-26 20.5 16.5 24.7 0.0 1017.4
1093 London Heathrow GB 2025-07-27 19.2 15.4 23.1 0.2 1018.2
1094 London Heathrow GB 2025-07-28 19.3 14.5 23.9 0.0 1020.6
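One way to pick a station from the stations table is to filter by country and by whether its daily coverage spans the period you need. The sketch below does this on a few previewed rows. The target country and date range are hypothetical values chosen for illustration.

```python
import pandas as pd

# A few rows copied from the stations preview above (columns trimmed).
stations = pd.DataFrame({
    "station_name": [
        "Athabasca 1", "Jan Mayen", "Grahuken", "Hornsund", "Keongjhargarh",
    ],
    "country": ["CA", "NO", "NO", "NO", "IN"],
    "daily_from": pd.to_datetime(
        ["2000-01-01", "1921-12-31", "2010-10-07", "2009-11-26", None]
    ),
    "daily_to": pd.to_datetime(
        ["2022-07-12", "2025-07-21", "2020-08-17", "2020-08-31", None]
    ),
})

# Hypothetical requirements: a Norwegian station with daily data
# covering the whole period between these two dates.
wanted_country = "NO"
needed_from = pd.Timestamp("2022-07-30")
needed_to = pd.Timestamp("2025-07-21")

# Comparisons against NaT evaluate to False, so stations without daily
# data are excluded automatically.
covers_range = (stations["daily_from"] <= needed_from) & (
    stations["daily_to"] >= needed_to
)
candidates = stations[(stations["country"] == wanted_country) & covers_range]
print(candidates["station_name"].tolist())
```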
Yahoo financial market data
This dataset provides daily historical financial market data, including closing prices, for global equities and indices via Yahoo! Finance’s API.
Click here to use it on the MOSTLY AI platform.
date GOOGL NVDA AAPL MSFT AMZN META TSLA
2021-01-04 85.79 13.08 126.24 209.62 159.33 267.47 243.26
2021-01-05 86.48 13.37 127.80 209.82 160.93 269.49 245.04
2021-01-06 85.63 12.58 123.50 204.38 156.92 261.87 251.99
2021-01-07 88.19 13.31 127.71 210.19 158.11 267.27 272.01
2021-01-08 89.36 13.24 128.82 211.48 159.13 266.11 293.34
... ... ... ... ... ... ... ...
2025-07-22 190.10 171.38 212.48 510.06 229.30 712.97 328.49
2025-07-23 191.34 167.03 214.40 505.27 227.47 704.81 332.11
2025-07-24 190.23 170.78 214.15 505.87 228.29 713.58 332.56
2025-07-25 192.17 173.74 213.76 510.88 232.23 714.80 305.30
2025-07-26 193.18 173.50 213.88 513.71 231.44 712.68 316.06
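A common first step with a wide table of closing prices like this one is to convert prices into daily percentage returns. The sketch below does so for two tickers using the first three previewed rows. It is an illustrative example, not a workflow prescribed by the platform.

```python
import pandas as pd

# First three rows of the preview above, restricted to two tickers.
prices = pd.DataFrame(
    {"GOOGL": [85.79, 86.48, 85.63], "NVDA": [13.08, 13.37, 12.58]},
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"]),
)
prices.index.name = "date"

# Daily return = today's close / previous close - 1.
# The first row has no previous close, so it is dropped.
daily_returns = prices.pct_change().dropna()
print(daily_returns.round(4))
```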