Menggy Lab: Internet Data Collection and Extraction


The modern Internet holds a massive amount of data, which has caught the attention of many academic researchers. However, handling big data (collecting, extracting, cleansing, formatting and organizing it) has become an urgent problem, especially for researchers who did not major in computer science. Collecting data manually is too time-consuming, and buying data from a third-party company is too expensive, so many projects have to be stopped before they even get started.
Menggy Lab aims to provide data support for academic researchers, universities or other academic institutions for FREE.


Driven by Data

“The future world will be driven by data rather than petroleum.” - Jack Ma

How big the Internet is

It is estimated that the data owned by the Four Giants exceeds 1,200 PB, i.e., 1.2 million terabytes.

The value of the data

“By measuring the entropy of each individual's trajectory, we find a 93% potential predictability in user mobility across the whole user base.” - Science, 19 February 2010

The size of the data they collected

Google: 5,000 PB
Baidu: 2,000 PB
Facebook: 300 PB
eBay: 90 PB

Menggy Lab has collected

Datasets: 66
Records: 25,419,764
Size: 6.19 GB
Web pages: 17,603,230

Data Explore

GitHub

The GitHub Blog data - github.com/blog
Update:2016/06/22 - Create:2016/06/22

Records:694 | Variables:10 | Size:1.22 MB | Downloads:0 | Views:63
Allen - 8 days ago

心食谱 (Xinshipu)

All recipe data from 心食谱 (Xinshipu) - www.xinshipu.com
Update:2016/06/21 - Create:2016/06/20

Records:69,141 | Variables:14 | Size:62.25 MB | Downloads:0 | Views:42
Allen - 9 days ago

crowdSPRING

LOGO Projects user participation data - crowdSPRING - crowdspring.com
Update:2016/06/07 - Create:2016/06/07

Records:35,862 | Variables:10 | Size:5.23 MB | Downloads:0 | Views:152
Allen - 23 days ago

crowdSPRING

LOGO Projects Activity data - crowdSPRING - crowdspring.com
Update:2016/06/01 - Create:2016/05/31

Records:2,446,783 | Variables:9 | Size:717.43 MB | Downloads:0 | Views:102
Allen - 1 month ago

crowdSPRING

LOGO Projects user information data - crowdSPRING - crowdspring.com
Update:2016/06/07 - Create:2016/05/31

Records:51,088 | Variables:14 | Size:12.51 MB | Downloads:0 | Views:142
Allen - 1 month ago

crowdSPRING

LOGO Projects statistics data - crowdSPRING - crowdspring.com
Update:2016/06/01 - Create:2016/05/30

Records:28,400 | Variables:26 | Size:4.47 MB | Downloads:2 | Views:50
Allen - 1 month ago

crowdSPRING

LOGO Projects Detail data - crowdSPRING - crowdspring.com
Update:2016/06/01 - Create:2016/05/30

Records:29,970 | Variables:8 | Size:4.13 MB | Downloads:1 | Views:128
Allen - 1 month ago

crowdSPRING

LOGO Projects basic project data - crowdSPRING - crowdspring.com
Update:2016/06/01 - Create:2016/05/27

Records:28,400 | Variables:11 | Size:4.2 MB | Downloads:0 | Views:49
Allen - 1 month ago

TripAdvisor

TripAdvisor (猫途鹰) Beijing attractions data - tripadvisor.com
Update:2016/06/20 - Create:2016/05/17

Records:1,507 | Variables:15 | Size:663.26 KB | Downloads:0 | Views:91
Allen - 1 month ago

TripAdvisor

TripAdvisor (猫途鹰) Beijing restaurants data - tripadvisor.com
Update:2016/06/20 - Create:2016/05/16

Records:8,668 | Variables:15 | Size:3.41 MB | Downloads:0 | Views:90
Allen - 1 month ago