This project is the first to combine a data repository with real-time visual analytics for interactive data mining and exploratory analysis on the web. State-of-the-art statistical techniques are combined with real-time data visualization, allowing researchers to seamlessly find, explore, and understand a large number of publicly donated data sets and to discover key insights in them. This large, comprehensive collection is useful both for making significant research findings and as a source of benchmark data sets for a wide variety of applications and domains. It includes relational, attributed, heterogeneous, streaming, spatial, and time series data, as well as non-relational machine learning data. All data sets can be easily downloaded in a standard, consistent format. We have also built a multi-level interactive visual analytics engine that allows users to visualize and interactively explore the data in a free-flowing manner.

MLvis.com - An interactive visual data exploration repository


Our vision

Scientific progress depends on standard datasets against which claims, hypotheses, and algorithms can be compared and evaluated. Despite the importance of such datasets, finding the original data used in published experiments is difficult and time consuming at best, and often impossible. This site is an effort to help the scientific research community share, find, and interactively visualize and explore data, including its frequent patterns and trends as well as its outliers and anomalies. We are the first data repository to combine visual analytics with state-of-the-art statistical techniques, allowing researchers to seamlessly explore, compare, and analyze the data in real time on the web. The site makes it easy for researchers to download data, compare results and findings from papers, and analyze and compare hundreds of datasets from a variety of collections and domains. Our goal is to make these scientific data widely available to everyone while also providing a first attempt at interactive analytics on the web.

  • Network Repository (NR) is the first interactive data repository with a web-based platform for visual interactive analytics. Unlike other data repositories (e.g., the UCI ML Data Repository and SNAP), the network data repository (networkrepository.com) allows users not only to download data, but also to interactively analyze and visualize it using our web-based interactive graph analytics platform. Users can analyze, visualize, compare, and explore data in real time along many different dimensions. The aim of NR is to make it easy to discover key insights into the data quickly and with little effort, while also providing a medium for users to share data, visualizations, and insights. Other key factors that differentiate NR from current data repositories are the number of graph datasets, their size, and their variety. Other data repositories are static and lack a means for users to collaboratively discuss a particular dataset, corrections to it, or the challenges of using the data for certain applications. In contrast, we have incorporated many social and collaborative aspects into NR in the hope of further facilitating scientific research (e.g., users can discuss each graph and post observations, visualizations, etc.).