Data Science Altitude for This Article: Camp Two. As we concluded our last post, we used Azure ML Studio to fit a logistic regression model to some data for a hypothetical electrical grid (11 predictor variables). We had a first look at the output, which scores whether our response variable was adequately predicted, and promised a dive into the scoring. We’ll get to that, but first I’d like to put together another, competing model to compare and contrast with our logistic regression efforts.
Data Science Altitude for This Article: Camp Two. Splitting the Data: Azure ML Studio gives us a handy way to split our data and offers several options for doing so. I’ll mark 80% of our data for use in training the model, and 20% for scoring it, with an eye towards which model is better when encountering new data. With our split of categories to predict being roughly 60/40, there’s little to gain from ensuring that the division of data into 80% train / 20% test keeps to a consistent 60/40 split along those category percentages.
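As a rough illustration of the 80/20 idea outside of the drag-and-drop interface, here is a minimal plain-Python sketch. It is not what Azure ML Studio does internally; the function name `split_rows`, the seed, and the 1,000 made-up labels in a 60/40 mix are all assumptions for illustration only.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows, then cut it into (train, test) lists."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(rows)          # copy; leave the caller's data untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def share(rows, label):
    """Fraction of rows carrying the given label."""
    return sum(1 for r in rows if r == label) / len(rows)


# Hypothetical stand-in for the response column: a 60/40 mix of two classes.
labels = ["class_a"] * 600 + ["class_b"] * 400

train, test = split_rows(labels)

print(len(train), len(test))
print(round(share(train, "class_a"), 3), round(share(test, "class_a"), 3))
```

With a plain random split like this, the class shares in each piece will usually land close to the overall 60/40 mix without being forced to, which is the reasoning behind skipping a stratified split here.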
Data Science Altitude for This Article: Camp Two. Our prior posts set the stage for access to MS Azure’s ML Studio and got us rolling on data loading, problem definition and the initial stages of Exploratory Data Analysis (EDA). Let’s finish off the EDA phase so that in our next post, we can get to evaluating the first of two models we’ll use to forecast stability - or the lack thereof - for a hypothetical electrical grid.
Data Science Altitude for This Article: Camp Two. We left our prior post having covered what’s involved in setting up a Microsoft Account, a survey of your access options (Quick Evaluation / Most Popular / Enterprise Grade), and a brief and basic tour of the Azure ML Studio Interface. We’re going to head there shortly and get a big whiff of the drag-and-drop catnip, but first, let’s discuss the particular problem I’d like to throw at it and set the stage for the next several posts.
Data Science Altitude for This Article: Camp Two. While working my way through the code in Wei-Meng Lee’s excellent Python Machine Learning, I ran into Chapter 11, titled Using Azure Machine Learning Studio. Sporting a drag-and-drop interface, the Studio lets you tackle many Data Science problems without the need to write a line of code. You’re not absolved from knowing why you’re doing what you’re doing, of course, but you can try out some problem-solving proofs-of-concept in short order.
Data Science Altitude for This Article: Base Camp. In the prior two posts, we first took the temperature of the R and RStudio user community and then we installed R. Now we’re ready to download the RStudio installer executable file and take a look at some of the functionality inside the IDE. The RStudio Product Page: However, before we do that, please take note of the RStudio product page.
Data Science Altitude for This Article: Base Camp. Today’s post goes into the ‘meat and potatoes’ of the install process for R. If you’re looking to get more involved in a data science career or a more classic application development job in Java, .NET, or something similar, it’s important to develop a comfort level with installing software on your own machine. You’ll have to do it repeatedly over your career.
Data Science Altitude for This Article: Sea Level. Today’s post is for those of you who are looking to install R and RStudio on your home machine for the first time, and for those who don’t (yet) have a comfort level with installing software in general. I’ll provide links to others’ guides and documentation that provide a lot of detail; I could try to write that up myself, but I’m not here to re-invent someone else’s wheel.