RClean

Nicholas Uhorchak 2018-03-05

Section 1 Basic Information

1.1 Name

RClean

1.2 Title

RCleaner is an interactive data cleaning tool that lets users import and clean data dynamically. At its core, it gives R users functionality similar to Microsoft Excel's for preparing a dataset for analysis.

1.3 Description

1.3A Features

Using R to import and clean data is often time-consuming. Without preparing the dataset in Excel or other software, R users must write scripts or command-line R code for this task. The Interactive Data Cleaning tool will let users do the following:

  • Initialize the RCleaner gadget with a dataset
  • Visually inspect the dataset loaded into the gadget
  • Select data columns to remove
  • Select data rows to remove
  • Rename columns in the data frame
  • Scale the data
  • Mean center the data
  • Encode nominal data as numerical data
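For comparison, the column removal, row removal, and renaming steps listed above can be sketched as the kind of scripted base R code the gadget is meant to replace. This is only an illustration: the data frame `df` and its column names are hypothetical, and nothing here is taken from RClean's own implementation.

```r
# Hypothetical example data; any data frame with two or more columns would do.
df <- data.frame(
  id    = 1:5,
  group = c("a", "b", "a", "c", "b"),
  score = c(10, 12, 9, 15, 11),
  notes = c("x", "y", "z", "w", "v"),
  stringsAsFactors = FALSE
)

# Remove a column by name; drop = FALSE keeps the result a data frame.
df_cols <- df[, setdiff(names(df), "notes"), drop = FALSE]

# Remove rows by position (here rows 2 and 4).
df_rows <- df_cols[-c(2, 4), , drop = FALSE]

# Rename a column in place.
names(df_rows)[names(df_rows) == "score"] <- "test_score"
```

Each step returns a new object, so the original `df` stays untouched until the cleaned result is explicitly saved.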

1.3B End users

This analytic is being developed for users who need to clean data quickly or who would rather not spend a large amount of time writing code to prepare data for analysis. Typical users will have a working knowledge of R but prefer the point-and-click abilities of Microsoft Excel or similar software.

1.3C Required knowledge/skills

Users must be able to navigate RStudio and understand how to use an R gadget. They should also know which types of data the dataset contains, whether numerical or categorical, so that they understand which functions of this tool apply.

1.3D Statistical methods utilized

  • Mean center data
  • Scale data
  • Generate indicator variables
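The three methods above can be illustrated with a minimal base R sketch. It makes two assumptions that the source does not confirm: "scale" is taken to mean dividing the centered values by the sample standard deviation, as `scale()` does by default, and indicator variables are built with `model.matrix()` rather than the `dummies` package that the tool itself lists. The example data is hypothetical.

```r
# Hypothetical example data: one numeric and one nominal column.
df <- data.frame(
  height = c(150, 160, 170, 180),
  colour = factor(c("red", "blue", "red", "green"))
)

# Mean centering: subtract the column mean from every value.
centered <- df$height - mean(df$height)

# Scaling (assumed meaning): center, then divide by the sample standard deviation.
standardized <- as.numeric(scale(df$height))

# Indicator variables: one 0/1 column per factor level.
# "- 1" drops the intercept so that every level gets its own column.
indicators <- model.matrix(~ colour - 1, data = df)
```

Dropping the intercept gives a full set of indicator columns, which suits data preparation; a regression model would normally omit one level as the reference category.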

1.3E R Packages utilized

This analytic will utilize the following existing R packages:

  • shiny
  • DT
  • shinythemes
  • markdown
  • dummies

1.4 End user access

End users will call this gadget from the associated R package.

1.5 Security concerns

None

1.6 Design constraints

Currently, the gadget only handles data frame, matrix, or tibble-like objects with two or more columns. Single vectors are not handled.
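A caller could check this constraint before launching the gadget. The helper below is a hypothetical sketch, not a function provided by RClean; it relies only on base R, where tibbles also pass `is.data.frame()`.

```r
# Hypothetical helper, not part of RClean: TRUE when `x` is a data frame
# (tibbles included) or a matrix with at least two columns.
is_supported_input <- function(x) {
  (is.data.frame(x) || is.matrix(x)) && ncol(x) >= 2
}

ok_df     <- is_supported_input(mtcars)                         # data frame, 11 columns
ok_matrix <- is_supported_input(as.matrix(mtcars[, 1:2]))       # two-column matrix
ok_vector <- is_supported_input(mtcars$mpg)                     # bare numeric vector
ok_onecol <- is_supported_input(mtcars[, "mpg", drop = FALSE])  # one-column data frame
```

Because `&&` short-circuits, `ncol()` is never evaluated for a bare vector, so the check returns `FALSE` instead of raising an error.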

Section 2 Delivery and Schedule Information

2.1 Feature Review

| Feature | Description | Rank | Status | Value to user | Inputs | Outputs | Use? | Time? | Current or future version |
|---|---|---|---|---|---|---|---|---|---|
| Visual inspection of data | This feature will open the newly imported DF so the user can look at the data | 1 | COMPLETE | Quick and easy visual exploration of the imported dataset | Some dataset | Dataset output onto screen | Visual exploration of data | Yes | Current |
| Select relevant data columns to retain/remove | Allow the user to select which columns to either retain or remove from the current data | 2 | COMPLETE | Easily remove unwanted variables from the dataset | Button click | Modified DF | Data cleaning | Yes | Current |
| Select relevant data rows to retain/remove | Allow the user to select which rows to either retain or remove from the current data | 3 | COMPLETE | Easily remove unwanted rows from the dataset | Button click | Modified DF | Data cleaning | Yes | Current |
| Save clean data | User can save the "clean" data to a new data frame in R | 4 | COMPLETE | Cleaned data saved for analysis | New name for clean DF | Clean DF | Save cleaned DF for future use | Yes | Current |
| Scale data | Allow the user to scale the data | 5 | COMPLETE | Scale the data for future use | Button click | Modified DF | Data prep | No | Current |
| Mean center data | Allow the user to center the data | 6 | COMPLETE | Mean center the data for future use | Button click | Modified DF | Data prep | No | Current |
| Rename columns | Allow the user to rename columns in the DF | 7 | COMPLETE | Rename columns if necessary | Column names if necessary | Modified DF | Data cleaning | No | Future |
| Create indicator variables | Allow the user to create "dummy" variables to represent nominal data | 8 | COMPLETE | Create indicator variables | Variables to encode | Modified DF | Data prep | No | Future |
| Write "clean" data to Excel | Allow the user to write the clean data to a new Excel file | 9 | COMPLETE | Clean data is saved into an external file for future use | File location | Excel document | Save file as Excel document for future use | No | Future |
| Modify DF cells | Allow users to click on a cell and change data values | 10 | Not started | Single cell value modification | N/A | Modified DF | Change cells | No | Future |
| Impute missing values | Allow the user to impute missing values | 11 | Not started | NA | Method of imputation | Modified DF | Data prep | No | Future |
