Bubble Sort and Project Euler Problem 1

As someone who is about to graduate and is in the midst of interviewing for coding jobs, I have found that I need to work on my problem-solving skills.  So I have decided to challenge myself and work on a few problems daily to sharpen those skills.

For student coders who aren’t aware, there is a great website called Project Euler that helps you develop your problem-solving abilities (if you haven’t done problem one yet and don’t want to know the answer, stop reading now).  It is a great site for building skills and tracking your progress against your peers.

So here is my solution to problem 1, which asks for the sum of all the multiples of 3 or 5 below 1000:

import java.util.Scanner;

public class EulerOne {

    public static void main(String[] args) {
        Scanner reader = new Scanner(System.in);
        System.out.println("Below what number would you like the sum of all the multiples of 3 or 5?");
        int n = reader.nextInt();

        int factorOfThree = 0;
        int factorOfFive = 0;
        int sum = 0;

        // check every number below n; no array is needed for this
        for (int i = 0; i < n; i++)
        {
            if (i % 3 == 0 || i % 5 == 0)
            {
                if (i % 3 == 0)
                {
                    // multiples of 3 (including multiples of 15) land here
                    factorOfThree = i + factorOfThree;
                }
                else
                {
                    // multiples of 5 that are not multiples of 3
                    factorOfFive = i + factorOfFive;
                }
            }
        }

        sum = factorOfThree + factorOfFive;

        System.out.println(sum);
    }
}

The solution is not perfect, but I left my original answer here to show students that your first solution is likely not the finished one.
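
One way a first draft like this could be tightened is to keep a single running total, since a number only needs to be counted once whether it divides by 3, by 5, or by both.  This is a minimal sketch, not the finished answer; the class and method names (EulerOneSketch, sumOfMultiples) are made up for illustration.

```java
class EulerOneSketch {

    // Adds up every multiple of 3 or 5 that is below the given limit.
    static int sumOfMultiples(int limit) {
        int sum = 0;
        for (int i = 0; i < limit; i++) {
            if (i % 3 == 0 || i % 5 == 0) {
                sum += i;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // below 10 the multiples are 3, 5, 6 and 9
        System.out.println(sumOfMultiples(10));
        System.out.println(sumOfMultiples(1000));
    }
}
```

Pulling the loop into its own method also makes it easy to check small cases by hand before trusting the big one.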

I also figured out how to do a bubble sort.  You can do this with a do-while loop as well, but I am more comfortable with the for loop, and that’s how I made it work for me.

// requires: import java.util.Arrays; import java.util.Random;
int[] sortedArray = new int[10];
Random randomGenerator = new Random();

// fill the array once with random values from 0 to 99
for (int i = 0; i < sortedArray.length; i++)
{
    sortedArray[i] = randomGenerator.nextInt(100);
}
System.out.println(Arrays.toString(sortedArray));

int n = sortedArray.length;
int temp = 0;
// each outer pass bubbles the largest remaining value to the end
for (int k = 0; k < n; k++) {
    // the last k slots are already in place, so stop short of them
    for (int j = 1; j < n - k; j++) {
        if (sortedArray[j-1] > sortedArray[j]) {
            temp = sortedArray[j-1];
            sortedArray[j-1] = sortedArray[j];
            sortedArray[j] = temp;
        }
    }
}
System.out.println(Arrays.toString(sortedArray));
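
For comparison, here is roughly what the do-while version mentioned above might look like.  It is a sketch under the assumption that the loop should stop as soon as a full pass makes no swaps; the class name, method name, and sample values are hypothetical.

```java
import java.util.Arrays;

class BubbleSortSketch {

    // Repeats passes over the array until one pass makes no swaps.
    static void bubbleSort(int[] a) {
        boolean swapped;
        do {
            swapped = false;
            for (int j = 1; j < a.length; j++) {
                if (a[j - 1] > a[j]) {
                    int temp = a[j - 1];
                    a[j - 1] = a[j];
                    a[j] = temp;
                    swapped = true;
                }
            }
        } while (swapped);
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3, 88};
        bubbleSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

The swapped flag lets an already sorted array finish after a single pass, which the fixed-count for loops do not do.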

Another New Stat, Contact WAM

I have felt that Wins Above Replacement sets a very low bar for measuring how valuable a player is (a team of 0 WAR players starts at around 47 wins), so I have been thinking of a way to improve on that.  With Contact WAM (wins above the mean) I use a z-score to measure how well a hitter performs on each kind of contact at the plate.

The technical explanation: I take every contact instance that results in a safe hit (single, double, triple, or home run) and work out how often a player gets each one per at bat.  I then take the league mean (average) of the same per-at-bat rate and use z-scores to express how far above or below the league mean each player performed for each contact instance.
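
The z-score step can be sketched as follows.  This only illustrates the standard formula, z = (value - mean) / standard deviation; the league rates are made up, the class and method names are hypothetical, and the use of the population standard deviation is an assumption.

```java
class ZScoreSketch {

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Population standard deviation (divides by n, an assumption here).
    static double stdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / xs.length);
    }

    // How many standard deviations a value sits above or below the mean.
    static double zScore(double value, double leagueMean, double leagueStdDev) {
        return (value - leagueMean) / leagueStdDev;
    }

    public static void main(String[] args) {
        // invented doubles-per-at-bat rates for three hitters
        double[] leagueRates = {0.04, 0.05, 0.06};
        double playerRate = 0.06;
        double z = zScore(playerRate, mean(leagueRates), stdDev(leagueRates));
        System.out.printf("z-score: %.2f%n", z);
    }
}
```

A positive z-score means the hitter got that kind of hit more often than the league average, and a negative one means less often.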

Here are the results for the Cubs hitters over 200 at bats this year.

Name               Contact WAM
Kris Bryant         3.65
Chris Coghlan       3.35
Anthony Rizzo       3.34
Dexter Fowler       3.00
Chris Denorfia      2.03
Starlin Castro      1.47
Jorge Soler         1.44
Kyle Schwarber      0.92
Addison Russell     0.73
Miguel Montero     -0.003

I think this determines how well a hitter performs when they make contact compared to the league average.  This is not adjusted for position or for the league they are in.

In the next few days I plan on working with this more and coming up with a way to reduce these numbers by including non-contact instances such as walks and strikeouts, as well as contact instances resulting in an out.  The goal is to start from an 81-win baseline for a team and see whether this stat can approach a team’s actual win total by using z-scores away from the mean.

Right now the total Contact WAM is just under 20, which would put the Cubs at 101 wins (81 + 20).  That is obviously too high, so a correction is in order to include the outs, which I will have to figure out how to do soon.
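
The arithmetic behind that total can be checked with a few lines of code.  This sketch just adds up the ten Contact WAM values from the table above and tacks them onto the 81-win baseline; the class and method names are hypothetical.

```java
class ContactWamTotalSketch {

    static double total(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Contact WAM for the ten Cubs hitters listed in the table
        double[] contactWam = {3.65, 3.35, 3.34, 3.00, 2.03, 1.47, 1.44, 0.92, 0.73, -0.003};
        double teamTotal = total(contactWam);
        double projectedWins = 81 + teamTotal;
        System.out.printf("Total Contact WAM: %.3f%n", teamTotal);
        System.out.printf("Projected wins: %.1f%n", projectedWins);
    }
}
```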

Functional Loan Calculator

I am trying to work on projects that I am not that good at so that I can become a better Java programmer.  So I decided to build a loan calculator with a GUI that determines how many months it would take to pay off a loan.  What I built here is very basic and doesn’t look pretty as a GUI, but it works!
When it came to building the GUI, I wasn’t concerned with coding the layout myself, as I am not concerned with design and could just use the NetBeans GUI creator to do it for me.  What I wanted practice with was inner classes and algorithms.  If you are a coder and are reading this, feel free to leave comments on any improvements you can see.

package LoanCalc;

import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;

import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JTextField;

public class Practice {

public static void main(String [] args){

BuildGUI gui = new BuildGUI();
gui.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
gui.setVisible(true);

}
}

class BuildGUI extends JFrame implements ActionListener{
private JLabel[] l;
private JTextField []t;
private JTextField k;
private JPanel p;
private JButton b;
private String []Labels={"Debt Type", "Total Owed", "Interest Rate", "Monthly Payment", "Total Months"};

public BuildGUI(){

t=new JTextField[5];
l=new JLabel[5];
p=new JPanel();
b=new JButton();

for(int i=0;i<Labels.length;i++){
t[i]=new JTextField(20);
l[i]=new JLabel(Labels[i]);
p.add(l[i]);
p.add( t[i] );
//t[i].addActionListener(this);

}
b.setText("Calculate");
k=new JTextField();
b.addActionListener(this);

javax.swing.GroupLayout jPanel1Layout = new javax.swing.GroupLayout(p);
p.setLayout(jPanel1Layout);
jPanel1Layout.setHorizontalGroup(
jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(javax.swing.GroupLayout.Alignment.TRAILING, jPanel1Layout.createSequentialGroup()
.addContainerGap(javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE)
.addComponent(b)
.addGap(104, 104, 104))
.addGroup(jPanel1Layout.createSequentialGroup()
.addGap(18, 18, 18)
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createSequentialGroup()
.addGap(28, 28, 28)
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING)
.addComponent(l[1])
.addComponent(l[0]))
.addGap(68, 68, 68)
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING, false)
.addComponent(t[0], javax.swing.GroupLayout.DEFAULT_SIZE, 138, Short.MAX_VALUE)
.addComponent(t[1])))
.addGroup(jPanel1Layout.createSequentialGroup()
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING)
.addComponent(l[2])
.addComponent(l[3]))
.addComponent(l[4]))
.addGap(40, 40, 40)
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(t[4], javax.swing.GroupLayout.PREFERRED_SIZE, 139, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(t[2], javax.swing.GroupLayout.PREFERRED_SIZE, 139, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(t[3], javax.swing.GroupLayout.PREFERRED_SIZE, 138, javax.swing.GroupLayout.PREFERRED_SIZE))))
.addContainerGap(62, Short.MAX_VALUE))
);
jPanel1Layout.setVerticalGroup(
jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createSequentialGroup()
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(jPanel1Layout.createSequentialGroup()
.addGap(18, 18, 18)
.addComponent(t[0], javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE))
.addGroup(javax.swing.GroupLayout.Alignment.TRAILING, jPanel1Layout.createSequentialGroup()
.addContainerGap()
.addComponent(l[0])))
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(l[1])
.addComponent(t[1], javax.swing.GroupLayout.PREFERRED_SIZE, 20, javax.swing.GroupLayout.PREFERRED_SIZE))
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(t[3], javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(l[2]))
.addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.UNRELATED)
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addComponent(t[2], javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addComponent(l[3], javax.swing.GroupLayout.PREFERRED_SIZE, 20, javax.swing.GroupLayout.PREFERRED_SIZE))
.addGap(18, 18, 18)
.addComponent(b)
.addGap(18, 18, 18)
.addGroup(jPanel1Layout.createParallelGroup(javax.swing.GroupLayout.Alignment.BASELINE)
.addComponent(l[4])
.addComponent(t[4], javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE))
.addContainerGap(91, Short.MAX_VALUE))
);

javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());
getContentPane().setLayout(layout);
layout.setHorizontalGroup(
layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(layout.createSequentialGroup()
.addComponent(p, javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addGap(0, 67, Short.MAX_VALUE))
);
layout.setVerticalGroup(
layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
.addGroup(layout.createSequentialGroup()
.addContainerGap()
.addComponent(p, javax.swing.GroupLayout.PREFERRED_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.PREFERRED_SIZE)
.addContainerGap(60, Short.MAX_VALUE))
);

pack();
}

//setting text fields to floats
String dt;
String to;
String mp;
String i;

float totalOwed;
float monthlyPayment;
float interest;
int months;

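public void actionPerformed(ActionEvent e) {

    // Execute when button is pressed
    //System.out.println("You clicked the button");

    dt = t[0].getText();
    to = t[1].getText();
    //the layout places t[2] beside the "Monthly Payment" label
    //and t[3] beside the "Interest Rate" label
    mp = t[2].getText();
    i = t[3].getText();
    try {
        totalOwed = Float.parseFloat(to);
        monthlyPayment = Float.parseFloat(mp);
        interest = Float.parseFloat(i);
    } catch (NumberFormatException ex) {
        //blank or non-numeric input
        t[4].setText("Enter numbers only");
        return;
    }
    //reset the counter so a second click starts from zero
    months = 0;
    //a payment of zero or less would never pay the loan off
    if (monthlyPayment <= 0){
        t[4].setText("Payment too low");
        return;
    }
    while (totalOwed > 0){

        //for loop simulates a year
        for(int month=0; month<=11; month++){
            //break from loop if paid off!
            if (totalOwed <= 0){
                break;
            }
            totalOwed -= monthlyPayment;
            //counter for months
            months ++;
        }

        //stop if the payments are not outpacing the interest
        //(the 1200-month cap is an arbitrary safety limit)
        if (months > 1200){
            t[4].setText("Never paid off");
            return;
        }

        //adding a year of interest (rate entered as a decimal, e.g. 0.05)
        totalOwed = totalOwed + (totalOwed*interest);

    }

    String monthTotal = Integer.toString(months);
    t[4].setText(monthTotal);

}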

}
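
The payoff loop in actionPerformed can also be pulled out of the GUI so it can be tried on its own.  This is a minimal sketch that assumes the same model as the code above (one payment a month, with interest added once after every 12 payments); the class name, method name, and the maxMonths cutoff are hypothetical additions.

```java
class LoanPayoffSketch {

    /**
     * Counts monthly payments until the balance reaches zero, adding
     * interest once after every 12 payments. Returns -1 if the loan is
     * not paid off within maxMonths.
     */
    static int monthsToPayOff(double owed, double yearlyRate, double payment, int maxMonths) {
        int months = 0;
        while (owed > 0) {
            // one simulated year of payments, stopping early if paid off
            for (int m = 0; m < 12 && owed > 0; m++) {
                owed -= payment;
                months++;
                if (months > maxMonths) {
                    return -1;
                }
            }
            // yearly interest on whatever balance is left
            owed += owed * yearlyRate;
        }
        return months;
    }

    public static void main(String[] args) {
        // 1200 owed, 10% yearly interest, 100 a month
        System.out.println(monthsToPayOff(1200.0, 0.10, 100.0, 1200));
    }
}
```

Keeping the math in a plain method like this means the button handler only has to read the text fields, call the method, and show the result.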

Inner Classes, GUIs, and a loan generator app.

In my attempt to learn as many things as possible this summer to become more employable, I am trying to create an app that figures out the total months it would take to pay off a debt or loan given the principal, interest rate, and monthly payment.  I came to this idea because I think it would be a good tool for those who are budgeting, like myself, and it would be a great way to learn a few things in programming and get more practice.

So I am going to post my progress every Monday.  Right now I have built a template for the inner class that handles the text fields.  I am going to spend this week trying to find a way to store the types of debt (debtName) and working out how to use the submit button to calculate the total months and display it in a text field.

package DebtManager;

import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;

import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JTextField;

public class Practice extends JFrame{

//button member declaration needed here

JTextField textField = new JTextField();
JButton button1 = new JButton();

class Handler implements ActionListener
{

    float b;

    @Override
    public void actionPerformed(ActionEvent e) {

        //read from whichever text field fired the event
        JTextField source = (JTextField) e.getSource();
        String a = source.getText().trim();

        if (a.isEmpty()){
            System.out.println("Field left blank");
            return;
        }

        try {
            b = Float.parseFloat(a);
            //jButton1ActionPerformed(owed);
        }
        catch (NumberFormatException e1){
            System.out.println(a + " is not a number");
        }

    }
}

//Define an inner member class for each Text Field
protected void buildGUI()
{

//initialize the text fields

JTextField debtName = new JTextField(10);
JTextField TOwed = new javax.swing.JTextField();
JTextField interest = new javax.swing.JTextField();
JTextField monthly = new javax.swing.JTextField();

//Submit button needs method to do math, and send result to totalMonths
JButton submit = new javax.swing.JButton();

JTextField totalMonths = new javax.swing.JTextField();

//register inner class action listener for each button

//need separate method for debtName to just keep name as String
debtName.addActionListener(new Handler());

TOwed.addActionListener(new Handler());
interest.addActionListener(new Handler());
monthly.addActionListener(new Handler());

//need a separate method in handler for this.
totalMonths.addActionListener(new Handler());

}

}

My New Correlation Stat with no Name

So I am trying to find out which batting instances contribute the most to creating runs.  Right now I am playing around with it just to see what kind of correlation coefficient I get, which shows how closely an instance tracks a higher ERA.  I am using ERA because it is the best way for me to measure runs minus any errors.  Maybe I will change that method as time goes on and I play with this more.

But here is my very first correlation, using league-wide batting average per year and earned run average per year.  Surprisingly, batting average only has a correlation of .621; I would typically want a correlation near .90 or higher before calling it strong.
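
For anyone who wants to reproduce this kind of number outside of Excel, the correlation coefficient can be computed in a few lines.  This sketch uses the standard Pearson formula; the yearly values in main are invented placeholders rather than the real league data, and the class and method names are hypothetical.

```java
class CorrelationSketch {

    // Pearson correlation coefficient between two equally long series.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0;
        double varX = 0;
        double varY = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // placeholder yearly values, not real league numbers
        double[] battingAverage = {0.255, 0.259, 0.266, 0.271};
        double[] era = {3.90, 4.00, 4.30, 4.40};
        System.out.printf("r = %.3f%n", pearson(battingAverage, era));
    }
}
```

A result of 1 means the two series move together perfectly, -1 means they move in exactly opposite directions, and 0 means no linear relationship.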

Now when you plot batting average and earned runs on a graph with the data normalized, they do appear to have a connection, just not a strong enough one.

The orange line is batting average and the blue line is ERA.  Because I am new to Excel graphs and MSPaint features, this graph looks pretty anemic, and I promise my skills will improve in that regard as I do this more often, but you can get the general idea here.

[Graph: BAERA]

My intent, after running this through several different statistical categories like batting average, OBA, OPS, slugging, etc., is to find out which instances have the highest correlations to earned runs, add a weight to each instance, and come up with a new stat!  We will see how this goes.  Also, I am taking ideas on a name for this stat; please leave any suggestions via comments or email.

Artificial Intelligence

I had to write a paper on artificial intelligence, and since I don’t like to waste my writing, I figured I would also put it on my blog.


There are many different types of jobs one can do if interested in the artificial intelligence field.  I will try to cover the five most popular career areas and perhaps touch on a few others.  Technically speaking, AI is a field and not yet a discipline like programming.  AI is built upon a knowledge of mathematics and numerous other sciences in order to detect, assess, and act on information.  Most positions in the field require a master’s or a PhD, with a strong emphasis on mathematical ability.  That being said, the most popular areas are game programming, robotics, facial recognition software, search engines, and government work in the defense industry.

Game Programming  

http://www.raywenderlich.com/24824/introduction-to-ai-programming-for-games

https://software.intel.com/en-us/articles/designing-artificial-intelligence-for-games-part-1

 

 

In game programming the player typically has to battle, compete with, defend against, out-strategize, or defeat an enemy.  To make this enemy difficult and the game challenging and fun, the enemy has to have some form of intelligence to attempt to outwit you.  Programmers have to create algorithms that move enemies in a way that counters your moves, or use seeking algorithms to find your position.

The average gamer has about 13 years of experience and typically plays a game because it is a challenge, so each new generation of games has to develop better AI to hold gamers’ attention.  Most games have a general set of rules that the AI adheres to, and the AI sticks to a certain chain of events or a script, so not much programming is needed for going off script or against those rules.

Most characters in a game are governed by finite states, such as idle, aware, aggressive, alert, and fleeing.  These states are triggered by the human-controlled player and can be activated by checking a certain variable.  There is also a reliance on predictive algorithms, which use a collected history of the human-controlled player to predict the next move.
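
A finite state setup like this can be sketched in a few lines.  The five states come from the paragraph above; the distance and health thresholds, the transition rules, and the names are all assumptions made up for illustration, not how any particular game does it.

```java
class EnemyStateSketch {

    enum State { IDLE, AWARE, AGGRESSIVE, ALERT, FLEEING }

    // Picks the next state by checking a couple of variables each tick.
    // All thresholds here are hypothetical.
    static State next(State current, double distanceToPlayer, int health) {
        if (health < 20) {
            return State.FLEEING;        // badly hurt: run away
        }
        if (distanceToPlayer < 5.0) {
            return State.AGGRESSIVE;     // player is close: attack
        }
        if (distanceToPlayer < 15.0) {
            return State.AWARE;          // player is nearby: take notice
        }
        if (current == State.AGGRESSIVE) {
            return State.ALERT;          // lost the player mid-fight: stay on guard
        }
        return State.IDLE;               // nothing going on
    }

    public static void main(String[] args) {
        State s = State.IDLE;
        double[] distances = {30.0, 10.0, 3.0, 40.0, 40.0};
        for (double d : distances) {
            s = next(s, d, 100);
            System.out.println("distance " + d + " -> " + s);
        }
    }
}
```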

 

AI characters in games also need the ability to perceive their surroundings and find paths, which gives them another level of intelligence.  Sight is done with vectors, distances, and angles; if something is not within a certain angle, the character doesn’t react to it.  Sound is done in a similar way.  Once you master sound and sight, you can have enemies perform tasks such as cover, crash, and turn.  After they have those tasks, you can create an algorithm to find the best available path from a starting position to an end position.
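
The sight check described above can be sketched with a distance test and an angle test.  This is a simplified 2D illustration that ignores walls and other obstacles; the class name, method name, and parameters are hypothetical.

```java
class SightSketch {

    /**
     * Returns true if the target is within maxDistance of the enemy and
     * inside a cone of halfAngleDegrees on either side of the direction
     * the enemy is facing. (faceX, faceY) must be a non-zero vector.
     */
    static boolean canSee(double enemyX, double enemyY, double faceX, double faceY,
                          double targetX, double targetY,
                          double maxDistance, double halfAngleDegrees) {
        double dx = targetX - enemyX;
        double dy = targetY - enemyY;
        double distance = Math.hypot(dx, dy);
        if (distance == 0) {
            return true;                 // standing on the same spot
        }
        if (distance > maxDistance) {
            return false;                // too far away to notice
        }
        // angle between the facing vector and the vector to the target
        double cos = (dx * faceX + dy * faceY) / (distance * Math.hypot(faceX, faceY));
        cos = Math.max(-1.0, Math.min(1.0, cos));
        double angle = Math.toDegrees(Math.acos(cos));
        return angle <= halfAngleDegrees;
    }

    public static void main(String[] args) {
        // enemy at the origin, facing along the x axis, 90-degree field of view
        System.out.println(canSee(0, 0, 1, 0, 5, 0, 10, 45));   // straight ahead
        System.out.println(canSee(0, 0, 1, 0, 0, 5, 10, 45));   // off to the side
    }
}
```

A hearing check could reuse the same distance test without the angle part, since sound does not depend on which way the character is facing.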

 

When it comes to finding jobs in the gaming industry, it is best to move to an area of the country that has a high concentration of these companies.  I lived in Austin, TX for a time and noticed that it had a very large number of these jobs compared to other metropolitan areas.  If you look on LinkedIn, there are about 400 openings countrywide in this field:

https://www.linkedin.com/job/game-programmer-jobs/

 

Robotic Scientists

http://science.howstuffworks.com/robot6.htm

As I write this, artificial intelligence in robots is still far from achieving the intelligence that humans have, with the ability to learn anything, to reason, to learn languages, and to formulate original ideas.  But researchers have made a lot of progress, and robots can replicate some elements of intelligent ability.  The most basic robotic intelligence relies on gathering facts about a situation through sensors and human input.  The computer then compares this information to stored data and decides what the information means.  Then the computer runs through possible actions and decides which action will be the most successful.

Modern robots have the capacity to learn in limited ways.  They can recognize whether a certain action was successful and store that information for the next time they attempt that action.  Some can even learn by mimicking human actions.  M.I.T.’s “Kismet” is a robot designed to interact socially by recognizing body language and voice inflection, and it uses lower-level computers and lower-level actions to mimic basic automatic human behaviors.

The major drawback to AI in robotics is the difficulty of mimicking the billions of neurons and their connections that lead to higher-level learning in humans; this complex circuitry is still not understood by AI developers.  Almost everything is theoretical right now, and scientists are creating theories and testing them to see if they have figured out the secret to this higher-level learning.  So far we are pretty far away from Arthur C. Clarke’s version of the future, where robots and humans are so integrated that humans can live hundreds of years.

Finding a job in robotics would appear to be much easier, but it also entails more than just the artificial intelligence aspect, which is hard to parse out in a job search.  In the field of robotics, there are currently over 4000 jobs you can find on LinkedIn:

https://www.linkedin.com/job/robotics-jobs/?trk=jserp_search_button_execute

 

Facial Recognition Intelligence

 

This kind of technology really fascinates me, as you could use it with a pair of glasses to observe a person’s face and immediately know their name and have access to their Facebook profile.  No one would ever have to worry about memorizing people’s names anymore.  Imagine how this would work in sales, marketing, and networking circles.  You could immediately appeal to anyone’s interests, like Facebook does with its advertising algorithms.

Privacy advocates, however, are going to be absolutely against software like this.  Anyone could tell whether someone is liberal or conservative, gay or straight, what their religious affiliation is, what sports teams they like, etc.  It could lead to more discrimination, or it could bring us closer together as we realize how many things we have in common with one another.  Imagine sitting at a bar alone and knowing that the person next to you loves the same sports teams you do, or is a devout Christian.

The software is currently almost as accurate as humans at recognizing faces, trailing by only slightly more than .25%!  Eventually this tech could surpass human recognition abilities.  This is the software that Facebook will eventually use for tagging people in photos; that is why when you hover over a photo it magically knows who it is, and it is about to become even more magical!

When searching LinkedIn for this type of job I ended up finding gigs at places like salons, so I thought it best to check other web sites, and I noticed that Indeed had a mere 107 total jobs listed:

http://www.indeed.com/q-Face-Recognition-Engineer-jobs.html

 

So while the number of jobs in this industry is very meager right now, you can bet that with its marketing potential it could end up being the fastest-growing part of artificial intelligence development out there.

Search Engine Intelligence

  • March 2013: Google acquires DNNresearch, a neural network startup out of the University of Toronto, and gets the team refocused on expanding traditional search algorithms.
  • January 2014: Google acquires DeepMind and sets up the artificial intelligence team to work directly with the Knowledge team on Google’s search algorithms. (They also, almost immediately, set up an AI ethics board — presumably, to save the human race from AI-wrought extinction.)
  • September 2014: Google expanded research surrounding quantum computing by hiring John Martinis and his research team out of UCSB.
  • October 2014: Google acqui-hires two teams of AI researchers from Oxford (and announces a partnership with the University) to “enable machines to better understand what users are saying to them.”

I took the list above from another site, just to give an example of how far ahead Google is of everyone else in using search engine capabilities to create artificial intelligence.  Rumor has it that they already have algorithms that can learn and are readjusting as we speak, which is why they have reduced the number of announcements they make about their own algorithm discoveries.  Part of that may be due to Google switching to search algorithms that are geared more towards fact-based accuracy over “hit”-based accuracy, but unless you are in Google’s top brass there really is no way to know.

Search Engine Optimization (SEO) still seems to be the place to go if you are entering the intelligence industry right now, but it is still very broad, as it could range from simply being a keyword specialist to being an actual algorithm developer at Google.  So be wary of the results below from LinkedIn, as they are likely not representative of actual AI jobs:

https://www.linkedin.com/job/search-engine-ai-jobs/?trk=jserp_search_button_execute

 

Artificial Intelligence in the Defense Industry

 

The description of quantum computing below is taken from NASA’s own website.

“The NAS facility hosts the Quantum Artificial Intelligence Laboratory, a collaborative effort among NASA, Google, and Universities Space Research Association (USRA) to explore the potential for quantum computers to tackle optimization problems that are difficult or impossible for traditional supercomputers to handle.

The laboratory houses a 512-qubit D-Wave Two™ quantum computer, installed at the NAS facility on the NASA Ames campus. The facility has been extensively retrofitted to provide isolation from noise and vibration, as well as the infrastructure required to cool the system to its near-absolute-zero operating temperature. The installation team recently completed a series of rigorous calibration and acceptance tests, and is planned to be operational in early fall 2013.”

Quantum computing is an exciting area of artificial intelligence.  Theorists say that one day you could turn on your computer simply by thinking about it or observing it; as quantum computing develops further, we will better understand how this could actually happen.

As for government jobs in artificial intelligence, government employers don’t seem to use social media as much as private sector employers do, so the jobs available through LinkedIn are fairly limited, at just a few dozen or so.

https://www.linkedin.com/job/artificial-intelligence-government-jobs/?trk=jserp_search_button_execute

 

Overall, artificial intelligence is an ever-evolving field in which human-like intelligence is being approached from many different industries, theories, and methods.

And as AI develops and becomes smarter, it becomes more likely that jobs will be taken away just as quickly as they are created.

Sports, Stats and Science Introduction and Cubs Opening Day

For my first post I thought it best to introduce what I am trying to bring here that may separate me from the thousands of other sports blogs.  When I was a kid I was a baseball card collector and a stats junkie, and now as an adult I am still a stats junkie and also a computer scientist who can use this data to do the things I want to do.  So I am going to combine my love of sports, stats, and coding on here and hope it can do something for the reader, and maybe get me a front office job with a baseball team or other sports team.

I see the Cubs and their stadium at the same stage of their building process this year.  The Cubs are past the transition point of selling off their bad parts and are now seeing shiny new rookies come to the plate and dazzle us.  The stadium is past the point of the rooftop owners blocking its progress and is now adding shiny new toys as well, like that massive scoreboard and really loud speakers.

There are also going to be some ups and downs as both the team on the field and the field itself work out some kinks.  We saw what young guys will look like at times, with Soler having a bad day in the field and being tricked into swinging at pitches outside the zone (something he apparently never did in Spring Training).  And at the stadium we saw, on the outside, a great-looking visual dedicated to Banks covering up the construction and, on the inside, long lines at the bathrooms and concessions, with the concessions running out of hot dog buns.

All in all it was a very exciting experience, as I watched the game from my brother’s place in Wrigleyville and heard the loudspeakers from as far as half a mile away.  The excitement this year will far exceed my expectations of what I hope to see.  I am not looking for a playoff team; I am simply looking for players to play well and for some improvements to occur so that we have a playoff team from 2016-2021 and multiple chances to win a World Series.  This team is far from complete, as is the stadium, but it will give us a taste of what may happen in the future, and that future could be amazing.

So what am I doing for coding?

I am working with Big Data and Hadoop, an open-source Apache framework, modeled on the MapReduce approach Google published, that processes lots of data quickly across many machines.  The program I am working on is simple, just so I can get my feet wet and develop my ability to do things with data, which can hopefully land me a front office gig.  My first program takes the overall batting average and ERA for both leagues combined from 1871 to 2014.  The idea is to see whether batting averages and earned run averages correlate and by how much, and eventually to do this with many batting categories.  Right now I am done figuring out how to get the averages, so if you are a baseball fan and can’t read code, you can start to zone out here and skip to the bottom.

Here is the Mapper class, which reads from a comma-separated value file and passes values to the reducer.  Comments in the code show what each step does.

package BaseballStats;
import java.io.IOException;
import java.io.StringReader;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

import com.opencsv.CSVReader;
public class StatMapper extends MapReduceBase
implements Mapper<LongWritable, Text, Text, Text> {

private Text Year = new Text();

private Text info = new Text();

public void map(LongWritable key, Text value, OutputCollector<Text, Text> output,
Reporter reporter) throws IOException {
//Text being converted to String and set to line
String line = value.toString();

//CSV reader created to parse the line through a StringReader
CSVReader R = new CSVReader(new StringReader(line));
String[] ParsedLine = R.readNext();
R.close();

//skip empty lines and lines without enough columns
if (ParsedLine == null || ParsedLine.length < 9) {
return;
}

//String values being set to Text value.
Year.set(ParsedLine[1]);
//take in at bats and hits and put into one value
info.set(ParsedLine[6]+","+ParsedLine[8]);

output.collect(Year, info);
}
}

Here is the Reducer class, which takes the at bats and hits for each year and creates a league average for that year:

package BaseballStats;
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class StatReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> {

public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {

    float AB = 0;
    float H = 0;

    while (values.hasNext()) {
        //take in value and split it into at bats and hits
        String v[] = values.next().toString().split(",");
        int atbats = 0;
        int hits = 0;
        try {
            atbats = Integer.parseInt(v[0]);
            hits = Integer.parseInt(v[1]);
        } catch (ArrayIndexOutOfBoundsException | NumberFormatException e) { continue; }
        //skip rows with no at bats or no hits
        //note: rows with at bats but no hits are skipped too,
        //which nudges the average up slightly
        if (atbats == 0 || hits == 0) { continue; }
        //totalling at bats and hits for the year
        AB = AB + atbats;
        H = H + hits;
    }
    if (AB != 0 && H != 0) {
        //division to create batting average
        float average = (H/AB);
        //the listing was cut off at this point; writing the year and the
        //average to three decimals is assumed from the output shown below
        output.collect(key, new Text(String.format("%.3f", average)));
    }
}
}

Here is the driver that starts the program:

//same package as StatMapper and StatReducer so the classes resolve
package BaseballStats;

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class CountJob {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(CountJob.class);
        conf.setJobName("Batting Average");

        //the reducer emits Text keys (year) and Text values (average)
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(Text.class);

        conf.setMapperClass(StatMapper.class);
        conf.setReducerClass(StatReducer.class);

        //args[0] is the input path, args[1] is the output path
        FileInputFormat.addInputPath(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);

    }
}

The output of this code looks like this:

1988    0.255
1989    0.255
1990    0.259
1991    0.256
1992    0.256
1993    0.266
1994    0.271
1995    0.268
1996    0.271
1997    0.268
1998    0.267
1999    0.272
2000    0.271
2001    0.265
2002    0.262
2003    0.265
2004    0.267
2005    0.265
2006    0.270
2007    0.269
2008    0.265
2009    0.263
2010    0.258
2011    0.256
2012    0.255
2013    0.255
2014    0.252

I included these years because you can clearly see where the steroid era kicked in.
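
For readers without a Hadoop cluster, the per-year totals that the mapper and reducer compute can be sketched in plain Java.  This is only an illustration of the grouping idea (total hits divided by total at bats for each year); the rows are invented, the class and method names are hypothetical, and unlike the reducer it keeps rows with zero hits.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LeagueAverageSketch {

    // Each row is {year, atBats, hits}; returns year -> hits / at bats.
    static Map<String, Double> leagueAverages(List<String[]> rows) {
        Map<String, long[]> totals = new TreeMap<>();
        for (String[] row : rows) {
            long[] t = totals.computeIfAbsent(row[0], y -> new long[2]);
            t[0] += Long.parseLong(row[1]);   // at bats
            t[1] += Long.parseLong(row[2]);   // hits
        }
        Map<String, Double> averages = new TreeMap<>();
        for (Map.Entry<String, long[]> e : totals.entrySet()) {
            long[] t = e.getValue();
            if (t[0] > 0) {
                averages.put(e.getKey(), (double) t[1] / t[0]);
            }
        }
        return averages;
    }

    public static void main(String[] args) {
        // invented rows, not real player lines
        List<String[]> rows = Arrays.asList(
            new String[] {"2013", "500", "130"},
            new String[] {"2013", "300", "70"},
            new String[] {"2014", "400", "100"});
        for (Map.Entry<String, Double> e : leagueAverages(rows).entrySet()) {
            System.out.printf("%s    %.3f%n", e.getKey(), e.getValue());
        }
    }
}
```

The TreeMap keeps the years in order, which is the same job Hadoop’s sort step does between the mapper and the reducer.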