This is an example program that visualizes numerical and categorical attributes of a dataset. It is currently set up with a small example dataset of soccer players. However, with some modifications it can visualize any dataset of homogeneous objects whose attributes are categorical or numerical.
- Ensure you have Python, npm, TypeScript, and Node.js installed on your computer
- Clone the repo
- Either create a virtual Python env in the `/backend` folder or use your global environment. Either way, you need Flask installed in the environment that you'll be using; `pip install flask` should work for most Python installs.
- Install Node dependencies in `/frontend` by running `npm install` in that directory. This should populate a `/frontend/node_modules` folder with the necessary dependencies, which are listed in `package.json`.
- Run the backend by opening your Python environment with Flask and executing `flask run` in the `/backend` directory. Ensure port `5000` is open and accessible, as the backend runs on this port.
- Run the frontend by executing `npm start` in the `/frontend` directory. Ensure port `3000` is open and accessible, as the frontend runs on this port.
- If it doesn't open automatically, go to `http://localhost:3000` in your browser to view the data visualization.
When localhost:3000 is opened in a browser with the frontend and backend running, it displays a GUI with a table on the left and visualizations on the right.
- A dropdown menu on the top allows attributes to be selected and deselected. Selecting an attribute adds a column for it to the table and adds a visualization for it on the right (unless it falls into a small subset of attributes which aren't visualized because they aren't categorical or numerical).
- Clicking the header of any column sorts the entire table by that attribute. Clicking again reverses the sort order.
- Clicking a player row highlights their position in all data visualizations on the right. For numerical attributes, a vertical line indicates where the player falls in the distribution. For categorical attributes, the category of the selected player is shaded a different color.
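The click-to-sort behavior described above can be sketched as a small state transition. The frontend itself is written in TypeScript; this Python version is only an illustration, and the names `next_sort_state` and `sort_rows`, the sample rows, and the choice to start in ascending order on the first click are all assumptions rather than the project's actual code.

```python
from typing import Any, Dict, List, Optional, Tuple

SortState = Tuple[str, bool]  # (column name, ascending?)


def next_sort_state(current: Optional[SortState], clicked: str) -> SortState:
    """Return the sort state after the header `clicked` is clicked."""
    if current is not None and current[0] == clicked:
        # Same column clicked again: reverse the direction.
        return (clicked, not current[1])
    # A different column (or no previous sort): start ascending (assumed).
    return (clicked, True)


def sort_rows(rows: List[Dict[str, Any]], state: SortState) -> List[Dict[str, Any]]:
    """Return a new list of rows ordered by the column in `state`."""
    column, ascending = state
    return sorted(rows, key=lambda row: row[column], reverse=not ascending)


# Hypothetical rows; the real dataset has its own attributes.
rows = [
    {"name": "Player B", "age": 31},
    {"name": "Player A", "age": 24},
    {"name": "Player C", "age": 27},
]

state = next_sort_state(None, "age")   # first click on "age"
by_age = sort_rows(rows, state)
state = next_sort_state(state, "age")  # second click reverses the order
by_age_desc = sort_rows(rows, state)
```

Keeping the sort state as a single `(column, ascending)` pair means one click handler can cover both "sort by a new column" and "reverse the current sort".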
The server also handles API calls to the `/api` endpoint, returning results as JSON. These calls can be made to `localhost:3000`, in which case they are proxied to `localhost:5000`, or made directly to `localhost:5000`. All requests are URL-decoded, so, for example, `+` is converted to a space.
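To make the decoding rule concrete, here is a minimal sketch using Python's standard `urllib.parse.unquote_plus`, which treats `+` as a space in addition to handling percent-escapes. Whether the backend uses this exact function is an assumption, and the example values are illustrative rather than taken from the dataset.

```python
from urllib.parse import unquote_plus


def decode_segment(raw: str) -> str:
    """Decode one URL path segment, treating '+' as a space (assumed behavior)."""
    return unquote_plus(raw)


# Illustrative values only; they may not exist in the example dataset.
print(decode_segment("Real+Madrid"))           # Real Madrid
print(decode_segment("Real%20Madrid"))         # Real Madrid
print(decode_segment("C%C3%B4te+d%27Ivoire"))  # Côte d'Ivoire
print(decode_segment("A%2BB"))                 # A+B
```

Under this kind of decoding, a literal plus sign has to be sent as `%2B`, since a bare `+` would be read as a space.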
Endpoints:
- `/api`: Index endpoint; returns a list of all players and their attribute values.
- `/api/player/<name>`: Returns the details of a single player by name. Name matching is case-insensitive, but diacritics must currently match exactly.
- `/api/country/<country>`: Returns a list of all players and their attribute values from a given country.
- `/api/club/<club>`: Returns a list of all players and their attribute values from a given club.
- `/api/attributes`: Returns a list of all attributes that players have.
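As a rough sketch of how a client might call these endpoints, the snippet below builds the URLs with the standard library and fetches JSON from a running backend. The helper names (`endpoint_url`, `fetch_json`, `names_match`) are hypothetical, and `names_match` is only a guess at what "case-insensitive but diacritic-sensitive" could look like, not the backend's actual comparison.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # backend address from the setup steps


def endpoint_url(kind=None, value=None, base=BASE_URL):
    """Build the URL for one of the documented endpoints."""
    if kind is None:
        return f"{base}/api"
    if kind == "attributes":
        return f"{base}/api/attributes"
    if kind not in ("player", "country", "club"):
        raise ValueError(f"unknown endpoint kind: {kind!r}")
    if value is None:
        raise ValueError(f"endpoint {kind!r} needs a value")
    # Percent-encode the value so spaces and diacritics survive the request.
    return f"{base}/api/{kind}/{quote(value, safe='')}"


def fetch_json(url):
    """GET a URL and parse the JSON body (requires the backend to be running)."""
    with urlopen(url) as response:
        return json.load(response)


def names_match(requested, stored):
    """Assumed matching rule: ignore case, but keep diacritics significant."""
    return requested.casefold() == stored.casefold()


# Example usage, with the backend running:
#   players = fetch_json(endpoint_url())
#   club = fetch_json(endpoint_url("club", "Real Madrid"))
```

With a rule like `names_match`, `"JOSÉ"` would match `"José"`, while `"Jose"` would not, which is why a request for a player has to include the accented characters.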
I'm hoping to have more time to improve this project, as I really enjoyed writing it and have more ideas to improve it. Current ideas:
- Dark mode!
- Make sure density visualizations don't go below the minimum possible value (0)
- Check for and uninstall unused NPM dependencies
- Ensure all functions/constants are documented
- Add `IPlayer` interface so that TS can typecheck Player objects
- Abstract table code from `App.tsx` to its own file
- Move ALL constants to `Constants.tsx`
- Deselect player on 2nd click
- Visualizations in same order as table
- Quantized scrolling of visualizations
- Arrows to indicate sorting in `<th>` elements
- Minus icon in `<th>` to remove column directly from table
- Abstract interfaces and data processing so that any data type can be plugged in as long as a corresponding interface is defined for it.