Jared Dean is a Principal Data Scientist at SAS and an Adjunct Professor at Duke University. Jared is the mastermind behind the Batting Lab at SAS, which was the topic of one of my most recent videos. In this conversation, we learn about common misconceptions about statistics, why he left a management role to go back to an individual contributor position, and, of course, the whole story behind the SAS Batting Lab. I loved this conversation with Jared and I hope you enjoy it as well.
More on the Batting Lab: https://www.youtube.com/watch?v=0ItYIoOrrUs&ab_channel=KenJee
Jared's Linkedin: https://www.linkedin.com/in/jareddean
Transcription:
[00:00:00] Jared: If there's one thing people take away from this podcast or this episode, it's that showing, being able to show people and to tell that story and convey emotion or meaning to something, is gonna make it far more memorable. And you're gonna find a higher success rate. You know, that's an aspect of the typical data scientist: being able to tell a story and make that meaningful. And I think that's the impact that it has.
[00:00:32] Ken: This episode of Ken's Nearest Neighbors is powered by Z by HP, HP's high compute, workstation-grade line of products and solutions. Today, I had the pleasure of interviewing Jared Dean. Jared is a data scientist at SAS and an Adjunct Professor at Duke University. Jared is the mastermind behind the Batting Lab at SAS, which was the topic of one of my most recent videos. In this conversation, we learn about the common misconceptions about statistics that Jared sees, why he left a management role to go back to an individual contributor position, and of course the whole story behind the SAS Batting Lab. I loved this conversation with Jared and I hope you enjoy it as well.
Jared, thank you so much for coming on the Ken's Nearest Neighbors Podcast. Obviously we met in person when I was down at the batting lab at SAS, and that does happen to be one of your babies. So I am really excited to learn more about, first, your story, but then also the origin story behind the batting lab, and maybe a little bit more of the details and the finer working parts of how that was all put together. So again, welcome to the show and thank you for coming on.
[00:01:32] Jared: Thank you, Ken. It's an honor to be here. I've listened to the show and after meeting you, I just think it's a great thing and I'm happy to be able to be a part of it.
[00:01:40] Ken: Amazing. Well, I'm super excited to have you here. We had such a good talk last time and I'm sure we'll keep that rolling into this podcast interview as well. So the way I like to start every podcast is asking how you first got interested in data. Was this a single pivotal moment where you said, Wow, I love this domain, or I'm interested in it? Or was it more of a slow progression over time?
[00:02:04] Jared: Yeah, it's a great question. I don't really know. Thinking back, I was always really good at math as a kid, and I had kind of an engineering mentality of tinkering with stuff, playing with things. I mean, I was the kid who took stuff apart, my parents' stuff, and then couldn't always get it back together. So I don't know if it was a distinct moment or if it was a progression, but I don't remember a period of time when I wasn't interested in how things worked.
And just kind of understanding deeper. Again, I couldn't always get it back together, but I was interested in diving deeper and understanding how things worked. My mother hated it, you know. Like, I got a hold of her vacuum one summer and it was just never quite the same afterwards. So, mom, I apologize.
[00:02:44] Ken: No, that's so cool. I actually was similar in some sense, until I had one experience that probably killed any prospect of mechanical engineering for me. I was taking apart one of those disposable cameras, which it says explicitly not to do, and I touched the battery and gave myself a brutal electrical shock.
And after that I was like, I'm done taking this stuff apart. I remember that so vividly. That was almost as bad as the time I accidentally peed on an electric fence, but that's a story for another time.
[00:03:19] Jared: The accidental part is what I'm curious about, but yeah. Okay, fair enough. Another time.
[00:03:24] Ken: So I'm very interested to hear about the career path that led you to where you are now at SAS. I think a lot of us, you know, we start in a profession, we start down, I guess, our normal life in college or whatever it might be, and if we look back, we're a little bit surprised at how we got there.
Do you have one of those stories, or was it very linear, how you ended up where you are now?
[00:03:48] Jared: So, yeah, great question. I don't think it's that linear. So again, I have that kind of engineering background, or thought I wanted to be an engineer, and that's what I did my first couple years of undergrad. And then as I looked around, most of the people in the program at the school I was at were ending up in Detroit working for automakers, and that just didn't sound that appealing to me. And so I started looking around for other things to do. And I have a cousin who was a statistician, and they were in grad school, and they're like, Oh, well, you know, statistics is pretty cool.
I talked to them about what I like to do, and they're like, well, statistics is cool, cause you get to play in everybody else's backyard. You're an expert in one thing and you get to interact with people in their domain. And so senior year in high school, or sorry, senior year in college, I changed majors, did the whole statistics core, and graduated.
And then my first job was at the Census Bureau. I was a mathematical statistician there, and I kind of got my first touch of what I think we now call data science, but at the time it was just kind of statistics or data analytics. The project I got assigned to work on was evaluating the quality of a list the Census Bureau keeps, called the Master Address File, of every possible housing unit in the United States.
And so I was assigned a project to basically go figure out which geographic areas in the country had been the most overcounted or undercounted during Census 2000. We had a budget, we had to give reports to Congress, all that kind of stuff, and I got to think about how we would organize this. And from that point on, I was kind of hooked on taking large amounts of data and trying to make sense of those piles of data.
And that's largely what I've been doing since. I worked at the Census Bureau for several years, and then I moved to SAS, and here I've been able to work in roles developing software to help practitioners. The last few years I've been mostly working on open source and SAS integration, so our Python and SAS kind of integration. But I've also gotten to work a lot with customers, building applications, helping them improve their applications, understanding why things weren't working, and applying, you know, machine learning algorithms, computer vision, natural language processing. And all that science has evolved a ton over the last 10, 15 years. It's kind of amazing what we do now and take for granted, compared with what we tried to do, you know, 10 years ago.
[00:06:14] Ken: What have you seen evolve the most since maybe your time at the Census Bureau to now?
[00:06:20] Jared: So, I mean, obviously computing has gotten a lot cheaper, and because computing has gotten cheaper, I don't think you have to be quite as efficient in what you're trying to do.
Like, I remember at the Census Bureau rewriting code 2, 3, 4, 5 times to try and scrape out, you know, how can I pass over this data just one less time or two less times. And now computing's gotten so cheap, and it's so easy to scale out horizontally, that I don't find myself doing that nearly as much. My time is more... I'm a better programmer now, but my time isn't as useful in trying to minimize that part.
It's more about getting more concepts out. So I think cheaper computing has made that simpler. Also, tools have evolved. I mean, I know SAS has evolved their tools quite a bit, and then open source has also improved dramatically, you know, in the last 10 or 15 years.
[00:07:16] Ken: So, I'm interested in that efficiency versus ease of use conversation. For example, Python is an interpreted language; compared to C it's very slow, or something along those lines, right? A lot of the coursework I took in grad school when I was studying computer science was about algorithmic efficiency, Big O notation, figuring out, you know, what type of search is faster, this and that, which I think in some sense is very important.
But as you've described, a lot of the time compute is faster now. A computer today working on something, let's say a sorting problem, will brute force it faster than a computer from 10 years ago would using a cleverer sort, or something along those lines. Do you feel that it's less important to teach the efficiencies now, as a teacher yourself? Or do you think those are always gonna be an important bedrock for us to understand the implications of our analysis and our models?
[00:08:17] Jared: So, I think it's valuable to understand how it works, and I would never want to take it out of the curriculum. I don't know how much time, like, I agree with you. When I was in grad school, you know, we had whole classes talking about Big O notation and trying to solve problems, and I don't know that it needs to be treated as that important. Part of it depends on what you want to do going forward. I mean, there are people who are still working at the edge, you know, people who are working on either core Python libraries or, at SAS, developing SAS procedures, that are still writing most of their code in C because it just is faster.
And so they obviously need to understand that, and we wanna take advantage of that. But I think the day-to-day practitioner doesn't necessarily run into those problems. But when they do, I would want my employees, my colleagues, et cetera, to have an idea of how they would attack that and how to think about it from a Big O notation perspective.
So I don't think it's something we should get rid of, but at the same time, I don't think it's something that everyone has to walk around quoting Big O notation stats to each other.
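To make that trade-off concrete, here's a minimal Python sketch, not from the episode, timing a textbook O(n^2) insertion sort against Python's built-in O(n log n) sort on the same data. On modern hardware the built-in sort finishes in milliseconds while the quadratic one takes seconds, which is exactly the gap Big O predicts:

```python
# Illustrative only: an O(n^2) sort vs Python's built-in Timsort.
import random
import time

def insertion_sort(values):
    """Classic O(n^2) insertion sort, included only for comparison."""
    out = list(values)
    for i in range(1, len(out)):
        key = out[i]
        j = i - 1
        while j >= 0 and out[j] > key:  # shift larger elements right
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = key
    return out

data = [random.random() for _ in range(10_000)]

start = time.perf_counter()
insertion_sort(data)
print(f"insertion sort: {time.perf_counter() - start:.2f}s")   # seconds

start = time.perf_counter()
sorted(data)
print(f"built-in sort:  {time.perf_counter() - start:.4f}s")   # milliseconds
```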
[00:09:30] Ken: Yeah. You know, I personally think that's a major misconception that we see with a lot of the interviewing process. A lot of it is still, you know, reverse this linked list, do this and that. And it's a lot more focused on things that aren't necessarily relevant to your work, but relevant to the theory around your work. And I've always been curious how indicative that is of performance on the job, versus just whether you're good at remembering things or good at understanding very high-level theory. What other major misconceptions are there, maybe about the math or the statistics or the data literacy domain?
[00:10:10] Jared: So, again, part of this is my bias as a statistician, but, you know, I have four kids, the second just graduated from high school. And one of the things that frustrates me as a statistician is that the crown jewel of high school math is AP Calculus, you know, BC calc, and I think it should be statistics, whether it's AP Statistics or some kind of statistics class.
I mean, I do some calculus on a regular basis, but I do way more linear algebra, and I do way more reading of, you know, probabilities and likelihoods of things happening. And so I think the US education system should really focus more on statistics and probability, on understanding chance and likelihood, than on how to do an integral or, you know, what's the derivative of 3x squared.
I realize that's partly my bias, but I think we spend more time than we should talking about calculus that most people will never use, and less time than we should talking about statistics, given that you can't open up any major newspaper today and not see some kind of stat, some kind of information, you know, weather likelihoods, politics, all that kind of stuff. Statistics, and interpreting them, is essential.
[00:11:28] Ken: Yeah. I will say, I think you can't open up any major newspaper without seeing an article where someone misrepresents a statistic as well. So you're right, there's that additional...
[00:11:42] Jared: Absolutely. And educating people to be able to understand that, to help improve their BS detector, you know, would be valuable as well.
[00:11:52] Ken: Yeah. I mean, even something you just mentioned, meteorology or the weather. I think it's not commonly known that when it says, for example, a 40% chance of rain within a certain region, it means that there is a hundred percent chance of rain over 40% of that territory, which is the exact same thing in some sense, right?
Or it's like, there's a 40% chance, if you're in that territory, that you're gonna be rained on. But the way you interpret it can mean something very different to you, or to who you're communicating with, or whatever it means. And that's just a very small example. I hope I'm right about that.
And that's not just a myth that someone has told me or that I've read about. But, you know, there's nuance to all these things, and understanding that is something that I think should absolutely be taught within school, and even outside of school, like you're kind of doing with the batting lab project, which we'll get to in a little bit here.
[00:12:47] Jared: Yeah, no, it's absolutely true. I mean, it's just understanding what a 40% chance of something happening actually means. Like, if it happens, is that really a surprise? A 20% chance of something: if that event occurs, should that really be surprising, given it's a one in five chance?
I mean, shooting free throws in basketball or something like that. If Steph Curry, a 90% free throw shooter, misses two in a row, should we be shocked, or is that gonna happen over the course of, you know, 80, 90 games?
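Jared's free throw example works out neatly in a few lines of Python. This is a back-of-the-envelope illustration, not from the episode; the per-game attempt count is a made-up round number, and it treats consecutive pairs as independent, which overlapping streaks only approximately satisfy:

```python
# A 90% free throw shooter misses two in a row with probability 0.1 * 0.1.
p_miss = 0.10
print(p_miss ** 2)  # 0.01: rare on any single pair of attempts

# But over a season (say ~8 attempts a game, 82 games), there are hundreds
# of consecutive pairs, so at least one two-miss streak is nearly certain.
pairs = 8 * 82 - 1
print(1 - (1 - p_miss ** 2) ** pairs)  # ~0.999 under independence
```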
[00:13:22] Ken: Yeah, exactly, exactly. So, you know, talking about this data literacy, talking about understanding domains. And also, I briefly mentioned that you do some teaching outside of work. What are ways that people can make an impact outside of their career, using data, using these types of tools, and maybe just this knowledge that we have?
[00:13:48] Jared: So, you know, I think there's lots of ways to do data for good and make an impact. There's lots of people who are trying to help. I think, if I was giving advice to early-career professionals, one of the things is basically to find something that you love and then spend some time learning it or doing it.
One of the things I did that I think has paid huge dividends is, early in my career, I kind of dedicated an extra hour in my workday to learning or doing something I was passionate about. And sometimes that was related to work exactly; sometimes it wasn't. You know, the website Kaggle, I assume most of your listeners are probably familiar with Kaggle.
When I first joined SAS, I was working on the front end of SAS Enterprise Miner, which is this data mining product. I talked to Anthony Goldbloom, and I think I was one of the first 50 or 60 users at Kaggle. And I would use that site and the contests to basically test our product. Like, could I solve these types of challenges? Could I play with data? Just practicing is gonna be, I think, the best way to learn how to become more proficient and how to understand the things that you're working on.
[00:15:02] Ken: That's amazing. Sorry, my earphones are going in and out a little bit here. So, you know, I like the practice, getting on Kaggle, getting on all these different things. Is there a way that you recommend targeting maybe more specifically, outside of just what you're interested in?
Are there resources out there, maybe outside of Kaggle, that are good use cases for people to make a positive impact? I know there was a lot of stuff around the coronavirus and having data accessible there. Are there resources that you particularly like, or platforms that you particularly like to communicate on, that you think people can make a difference on?
[00:15:45] Jared: So, yeah, I mean, you know, I love government websites, government sources, because the data's pretty standardized, it's reliable. And again, having been a federal employee, you know, I understand the effort that goes into making that data accurate and correct. And then, again, I do some work at Duke; I'm an Adjunct Professor in the College of Medicine.
And so we do a lot of health statistics type stuff. And so datasets from, you know, the National Institutes of Health or Veterans Affairs, there's a lot of interesting things there, coronavirus data as well. But again, I think more important than the source is kind of: what are you passionate about?
If you're passionate about Formula 1, you know, there's datasets out there that you can go get. If you're passionate about sports betting or fantasy football, there's datasets that are there. I think being interested in the topic is gonna be far more motivating to try and learn and understand phenomena than what a class or something will give you.
It's kind of that interest, that drive, to, you know, win your fantasy football league, or to learn something about cancer research cause you have a family member who's been affected, or something like that. I think that internal drive is gonna be by far the most helpful.
[00:17:09] Ken: I love that. And so what are some of the things outside of work that have interested you the most personally?
[00:17:15] Jared: So again, I'm married and have four children, and the kids have been involved in a variety of sports and academics... So I've had kids that play baseball, which, you know, kind of leads into the batting lab a little bit.
But also kids that swam, and we were looking at, you know, swim times and how those change over time and how kids grow. And I looked at a project: could you basically predict 6 months or 12 months out who's gonna qualify for the Olympic Trials using swim data, or can you look at the progression of how athletes have improved?
One of my sons worked on a project looking at housing insecurity among kids where we live. And we looked at a project about how we could help donate backpacks and school items for kids at the beginning of school.
Lots of things are available at Christmastime, but kids feel insecure about going to school without a backpack, without paper, without pencils, and having to ask for them. And so lots of kids are likely to drop out in the first quarter; he found that through some research and things like that.
So he took it on himself to create a project where he donated hundreds of backpacks full of school supplies for kids. So it's kind of looking around you to understand what the situations are. And if you read about, you know, social justice or inequity or things like that, if that's an area you're interested in, then how can you try and make some kind of change in your local community?
But I think it's getting involved and being passionate. Passion, I think, is kind of underrated in the importance of following through and learning. You're a lot more likely to learn about something you're passionate about than something you're not.
[00:19:06] Ken: Something I really loved about what you just mentioned is that it seems like almost anything can be a data-related project if you want it to be. You might have to go collect some data, you might have to see what's out there, you might have to scrape something, but any problem we have out there does have some form of data component. And if you have this skillset, or if you're trying to build this skillset, there's a way to integrate it into most things.
Even a simple hobby: even if you love anime or something, you could make yourself a recommender. You know, there's a lot of information out there, which is a beautiful thing.
[00:19:42] Jared: Yeah. I mean, I do think that some of those projects where it's hardest to acquire data or make data are oftentimes the most interesting. You know, the canonical dataset for learning data science on Kaggle is that Titanic dataset, which, if you're interested in shipwrecks in the 1900s, I guess is pretty cool.
But you're probably interested in something else. And so finding some data to analyze, or gathering data for a while and then looking and figuring things out. And partly, I think gathering that data gives your mind time to think about and form hypotheses and strategies. I mean, I know that I'm a little weird, you know; I think about almost everything through kind of a data analytics lens.
But I think that's just how I process the world: thinking about and gathering data. You know, do these things make sense? Is this a phenomenon that's unusual? Is this consistent? Can I see a pattern here? Can I find patterns in daily life?
[00:20:43] Ken: I think maybe for the general population that would be considered weird, but I think for the data science population, it's probably considered quite normal. So I wouldn't worry about it too much.
[00:20:54] Jared: I mean, if I'm too nerdy for the audience, then I apologize ahead of time.
[00:20:58] Ken: Yeah, I sincerely doubt that will be the case. So, you know, something we didn't really touch on yet, and I think you might have slightly left it out of your lead-up story, is that you did take a risk and you joined a startup. I'm interested to hear that story. I'm also interested to hear what you learned from it, and if it was worth it.
[00:21:23] Jared: Yeah. So I'm a boomerang at SAS; that means that, you know, I left and then I came back. So, in 2014 I had just published a book about big data, data mining, machine learning, and I got contacted by a venture capital group about a mobile analytics startup that was getting started. It looked like a great opportunity and a chance to, like you said, take a risk.
And in hindsight it was really beneficial to my career, but the actual thing wasn't necessarily what I had hoped or thought it was gonna be. And so I learned a lot about what the right questions were to ask, and how to better understand, you know, kind of a situation.
But, you know, I went to the startup, I was the CTO, and got to do lots and lots of hands-on stuff. The startup didn't work out, and so, you know, then I was able to return to SAS, but I gained a lot of knowledge. So SAS is a company of 15,000 people, and there's lots of departments and things that take care of functions for me as a practitioner; there's an IT staff and things like that.
And working at a very, very small startup, I got to do a lot of those things and try those things. It was that opportunity to kind of explore. And in that process, I got to be a maker. At SAS, for a lot of my career, I had been a manager of people, which was great and fulfilling and wonderful, but I enjoyed being the maker.
And so when I returned to SAS, I was in an individual contributor role, and I've been a maker ever since. And I enjoy interacting with people and collaborating, and I enjoy being able to put my hands on things and make stuff. So it's been a great career change for me, to understand where my passion lies. It's not necessarily that I'm a better individual contributor than I am a manager, but I enjoy the making part more than I thought I would.
[00:23:22] Ken: What was the biggest personal individual adjustment that you had to make when you came back as a maker versus a manager?
[00:23:30] Jared: So, well, one was the accountability: you have a deadline, you have to get things done.
If things don't work, it's your problem to fix. As a manager, you are reliant on other people to help build your success. And as a manager, I felt like one of my main roles was to remove roadblocks for my employees, and I went to lots of meetings so that they didn't have to. It was kind of a running joke in the team I worked in that I went to more meetings than the rest of my team combined, so that they could have hands on keyboards.
And so moving into that maker, individual contributor role, it was a lot about making sure I had high accountability and that I thought about it from the lens of the customer. Like, how is someone gonna use this? How is someone gonna benefit from this? How do we make this easier, simpler, faster, and better? And being able to look at those things and take something from a conceptual idea to an actual implementation.
[00:24:36] Ken: That's amazing. So when you came back, you described to me that you had a key role in shaping your position, right? Can you explain to me what that process was like? How do you have that conversation? How do you build something like that for yourself?
[00:24:56] Jared: Well, I mean, I was very lucky, you know, to be able to work with a mentor. So when I came back, I reported directly to the CTO at SAS at the time. And he basically said, you know, go build stuff you wish you would've had when you weren't a SAS employee. So I got the chance to kind of build my own role and design it how I wanted to, but, you know, the key objective was: make stuff.
Stuff you wish you would've had, where SAS has not necessarily met the market need, or where there's people like you who want to be able to use SAS: how can we make that easier for them? And one of the things that's come out of that is a couple of open source projects, like SASPy and the SAS kernel, which are basically interfaces into SAS for non-SAS programmers.
You know, they integrate with Jupyter and with Python in general, to allow non-SAS programmers to take advantage of SAS analytics. And it's been great to be able to work on that and understand both the commercial side of software as well as the open source side, how things are developed, and how you can merge those and bridge those gaps to build a community around it.
One of the things that I'm, you know, most proud of is that in SASPy there's a couple of us that are basically paid maintainers; that's part of our job description here at SAS. But there's a whole community of contributors, I think it's up to about 20, 25 people, that are contributing aspects and parts of that code to help improve their experience in using it.
And so that's an aspect of open source that I think is fantastic: it allows you to put some skin in the game, and if you see an area that you can improve and you have the time to help, you can contribute and help make the software and make things better.
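For listeners who haven't seen SASPy, basic usage looks roughly like the sketch below. This is illustrative: the session setup depends on your SAS deployment and the connection settings in your saspy configuration file, so treat the details as environment-specific.

```python
# Illustrative SASPy usage: drive SAS analytics from Python/Jupyter.
import saspy

# Starts a SAS session using your local saspy configuration
# (connection details vary by deployment).
sas = saspy.SASsession()

# Reference a SAS dataset and peek at it.
cars = sas.sasdata('cars', libref='sashelp')
cars.head()

# Pull it into pandas, or submit SAS code directly.
df = cars.to_df()
result = sas.submit("proc means data=sashelp.cars; run;")
print(result['LST'])  # the SAS listing output
```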
[00:26:50] Ken: So speaking on that front, are there any best practices for making the libraries that you use consistently better?
[00:26:58] Jared: I mean, I would hesitate to claim to be an expert and tell people that I know the best way. The things that I have found, you know, are being responsive and open to the discussion, and asking the why, being a good listener to understand the why. When someone's asking for a feature or asking for something, it's trying to understand their use case and then helping kind of negotiate.
I always try to avoid falling into the pit of doing what I'm asked to do, as opposed to thinking about what the customer is trying to achieve, what the actual goal is. Because sometimes what they ask for is really hard, but there's something fairly related that's a lot simpler, and through a discussion you can get to: well, this thing is really hard, but this other thing, which I think is pretty close to what you're asking for, is either possible today or is a lot easier to do or implement.
And then I think it's just being as clear in your communication as possible about what you're trying to achieve, what your objective is. I think those are the best practices, and just kind of over-communicating is the best general principle. It takes time, and, you know, in the busy world we're in, that's not always possible, but I think that's definitely the best practice.
[00:28:19] Ken: This episode is brought to you by Z by HP, HP's high compute, workstation-grade line of products and solutions. Z is specifically made for high performance data science solutions, and I personally use the ZBook Studio and the Z4 Workstation. I really love that Z workstations can come standard with Linux, and they can be configured with the data science software stack. With the software stack, you can get right to work doing data science on day 1, without the overhead of having to completely reconfigure your new machine.
Now back to our show. It seems like on a lot of platforms, either GitHub or Stack Overflow, there's this huge gap between what the questions are asking versus what the responses are returning. And the gap comes from this unarticulated need from the person asking the question, because I find they go down this rabbit hole of trying to solve this one thing, and it's pretty far away from what they actually wanted to solve. And it's like, why are you doing it this way, when you could do it this entirely easier way?
And it's just cause they didn't know that there were other options to the way they were doing it. And I find that really fascinating. And I think you're right in the sense that asking good questions or getting feedback or having a conversation seems like more effort upfront, but it saves you time and effort in the future in almost everything that you do, which is a pretty powerful concept.
[00:29:46] Jared: Yeah. I mean, and this is where one of the other maintainers of SASPy, Tom Weber, he's fantastic at this. He just does a great job of asking questions back to really understand. Before he'll write a line of code, he really wants to understand: what are you trying to do?
He's been a great example for how I should do better. But from the point of view of a maintainer, you know this code base so well, and sometimes I fall into the pit where someone's asking for something and I think, you know, either: you didn't read the docs, you're wasting my time.
Or: did I miss something that simple? And then you kind of fall into that rabbit hole. And so it is a balance, cause you're context switching back and forth. Like, you know, I maintain this project, but it's not my full-time job; it's just a part of my job. And so with the context switching back and forth, I sometimes fall into the trap of being hurried and rushed.
And I'm sure that's what happens with other maintainers as well: they want to be helpful, but it's just, you know, time, value, money, things have shifted. I mean, one of the challenges I think in general with open source is kind of: what is the longevity of that particular project? And as a consumer, you know, looking to consume an open source package, you have to ask yourself in the back of your mind: who implemented this, who's kind of funding it, you know, is it someone finishing their PhD dissertation?
And once they finish it, is this thing gonna go dormant? You know, what kind of legs does it have? Evaluating that is kind of one of the risks. There's definitely pros and cons to open source versus commercial software, and that's definitely a risk of open source. And there's no one to pick up the phone and yell at, which is a huge advantage of commercial.
[00:31:30] Ken: Yeah, that makes a lot of sense. So let's talk about one of the other really big projects you were working on in this role of yours, and that would be the batting lab. I'm very interested to hear the origin story of this. I know one of your sons is interested in playing college baseball, so I have to imagine that this project was bred at least a little bit out of self-interest as well.
[00:31:55] Jared: Well, there was certainly a symbiotic relationship, or there was some synergy there, yeah, for sure. So the origin story is, you know, in my role of kind of making stuff, or building stuff I wish I would've had when I wasn't a SAS employee.
One of the roles I kind of moved into was helping do high-level demos or support for marquee customers, things like that. And so I would occasionally get calls from C-level executives. In this case, the chief marketing officer contacted me to sit in and participate on some marketing brand activation type ideas.
And this was around baseball. And so, you know, it went through a number of evolutions, but basically what we landed on was trying to build a batting cage, a smart batting cage, that would help kids get better at their baseball or softball swing and, in the process, teach them about data analytics.
So in the South there's kind of an expression about adding broccoli to the mac and cheese. You know, the kids love the mac and cheese, and we kind of sneak in vegetables on the side. And that's kind of how this project evolved. There were kind of two goals. How can we show SAS technology and how it can help, you know, solve a problem that's relatively easy to understand, but kind of complicated as well?
And how can we then help? Because at SAS, you know, for Dr. Goodnight and Mrs. Goodnight, one of their philanthropic missions is really education. And so this kind of helps meet that data-for-good and childhood literacy and education aspect of exposing kids to data.
You know, I think we talked about this in the video you did, but I mean, I really think that learning, understanding data and how to use data is gonna be as essential to this group of kids when they reach the job market as knowing how to read. Like, it will be essential to be able to analyze data, to look at data, and be comfortable with it.
And so that was kind of one of the aspects. The idea that was originally pitched was, you know, kind of baseball and softball related, and we tried to refine that, hone it down to: how can we help an individual player get better, and expose them to data and increase data confidence at the same time?
[00:34:12] Ken: It seems like there's a lot of really powerful crossover, too, for businesses, in getting them to understand the real value that machine learning is creating. I talk with a lot of companies through my work, through content, through a lot of these things, and they have so much trouble communicating the value of what they're doing to B2B customers. This seems like such a concrete way to showcase it: these kids come in, and they're hitting it this hard.
They do this program, where they're supplemented with machine learning and human attention, and they come out hitting it this hard. To me, that's such a beautiful use case. And again, if these kids can understand it and these businesses can't understand it at that point, they might have some bigger issues.
[00:35:04] Jared: Well, yeah, I appreciate you acknowledging that. I mean, that was kind of the idea and the objective: showing how SAS technology works. Again, one of the parts of my role has been, I don't want to tell people what the capabilities are.
I want to show them what the capabilities are by building something that works. And there were a lot of things to integrate and build here to demonstrate that. So we have SAS technology at the core. One of the key algorithms we're using here is a hidden Markov model, which is the same type of technology, the same algorithm, you might use to do outlier detection.
So, you know, if you're looking for fraud in a banking context, or on a manufacturing line you're looking for defects or things like that, we're using that same algorithm. It's just that, in this context, we gathered hundreds of swings from dozens of players in the North Carolina State baseball and softball programs and built our average optimal Division I baseball and softball model.
And so now what we do is, when the player swings, we're basically looking for how much of an outlier they are. You know, the program is targeted at 10-to-14-year-olds; we've had plenty of adults go through, as you saw. But we're basically saying, if there's a 10-to-14-year-old who swings a bat and they don't look like this Division I baseball or softball player, it's the youth player that has to change.
And so then we worked through building a little recommendation engine to identify the aspects of that, and how we can give them maximum improvement to make them look more like those NC State baseball and softball players. But it's using algorithms that are common in a B2B setting. It's just changing the context, and understanding that context, in order to make it appropriate and useful for helping kids, you know, get better at baseball and softball.
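The Batting Lab itself runs on SAS, but the outlier-scoring idea Jared describes can be sketched with the open-source hmmlearn library. This is an illustrative stand-in, not their implementation; the random arrays below stand in for real pose features:

```python
# Sketch: fit an HMM on reference swings, then flag new swings whose
# likelihood under that model is low (i.e., outliers).
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(0)

# Pretend reference data: 200 college-player swings, each 60 frames of
# 38 pose features (joint x/y coordinates), as described in the episode.
reference_swings = [rng.normal(size=(60, 38)) for _ in range(200)]
X = np.concatenate(reference_swings)
lengths = [len(s) for s in reference_swings]

# Fit the "optimal swing" model on the reference swings.
model = GaussianHMM(n_components=8, covariance_type="diag", n_iter=50)
model.fit(X, lengths)

# Score a new swing: a low average log-likelihood per frame means the
# swing looks unlike the reference players, i.e., it is an outlier.
new_swing = rng.normal(size=(60, 38))
print(model.score(new_swing) / len(new_swing))
```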
[00:36:53] Ken: So, I really want to get into the nuts and bolts of the machine learning and the algorithms that you used. But before that, I wanted to highlight something you said about showing rather than telling. To me, that is, even outside of data science, one of the most important and compelling things that you can do for your career, for your life, for any of the relationships you have. The idea of telling someone something is very fickle, right? It could be true, it could be not true.
If you are showing someone through action that you're capable of doing something, or that you've done something, or these are the results, it just fires a different part of our brain, our mimicry system or whatever it might be. And that conveys so much more information, right? Like, humans work in stories.
We don't work in facts or data points. And that might sound contrary to our work, but it's one of the most important parts of our work, right? Our job is to get someone else to either use our data or use our solution, to get value from it. And in most scenarios, the only way they're gonna get something from our work or from our outcomes is by weaving it into a story that they can understand.
I mean, not to go too far off on a tangent, but you think about how information was conveyed over time. Historically, before we could even write things down, you had to put things into a beautiful narrative so it was memorable, so people would go back to it and be able to convey it going forward.
And I don't know, to me that just unlocks something that's very powerful within the human condition. It's not just in the data domain: even if you're trying to get someone to remember something about you, or even your name, if you have a story to it, it'll be more impressionable.
It'll be something that will leave a mark on other people. And maybe, if there's just one thing that people could take away from this podcast, it would be that the story that you attach to any of your work is what people are gonna remember. Not the algorithm used, not even necessarily the way you said it or the quality of the data. It's gonna be the story you attached to it.
[00:39:07] Jared: For sure. I mean, it's far more tangible, and as a practitioner, it makes my job a lot simpler. Like, I don't have to try and convince you that I'm knowledgeable on the subject, or that I know how to do this, or that our company is capable of doing this.
It's: I'll show you what we did. You know, we had the ability to do all kinds of different things here; here's how we actually solved our problem. And because we can solve this problem, we can solve a similar problem that you have. So yeah, I agree: if there's one thing people take away from this podcast or this episode, it's that showing, being able to show people and to tell that story and convey emotion or meaning to something, is gonna make it far more memorable.
And you're gonna find a higher success rate. You know, that's an aspect of the typical data scientist: being able to tell a story and make that meaningful. And I think that's the impact that it has.
[00:40:04] Ken: I love that. Alright, now that we've gotten that out of the way, let's dive into some of the specifics of this machine learning model. You obviously mentioned the hidden Markov model, and you described to me that you used pose estimation as well. Can you talk about how those were related? And also a little bit about what pose estimation is, for the layman?
[00:40:28] Jared: Sure. Yeah. So again, the first thing in thinking about this was: how are we gonna help the players improve? And it all goes back to, well, what's the data source that we have? SAS has a very long, deep relationship with NC State, North Carolina State University.
And so we contacted their baseball and softball teams, and they were very generous with their time. So we went to gather data. We intentionally chose to use a camera for our data gathering, as opposed to sensors or some kind of vest players wear. And part of that was practical: you know, we did this during the pandemic, and so sharing sensors, we didn't know the practicality of that.
Also, I think, fundamentally, again, being a baseball parent, I know that players swing differently when they're wearing some kind of contraption versus being on the field and playing. And so we wanted to try and make it as natural as possible, and that made it easier.
So we went and gathered that data. And then what we have to do is turn that data into something the machines can look at. So pose estimation is a branch within computer vision, which is a branch within artificial intelligence. And basically what pose estimation does is map the joints in X, Y space on a frame.
So you think about a video: a video is just a collection of pictures. And on each picture we want to label: where are the shoulders, the elbows, the eyes, the hips, the nose, knees, heels, toes, et cetera. And as we pass each frame of a video through that pose estimation algorithm, we get back the X, Y coordinates of all those joints.
And so then what we do is we take that and think about it as a rectangular dataset or matrix. We have the columns across: all the joints. So we have nose X and nose Y, right hip X, left hip Y, et cetera. And then each row in that is a frame. And so we can see how the joints move, both together and over a time series, through the course of that swing.
And that's what we fed into that hidden Markov model: that time series of each swing and the joints. What we ended up using is a 38-dimensional time series, to identify how players move, on aggregate, to hit the ball as hard as possible.
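In code, the layout Jared describes looks something like the sketch below. The joint list is hypothetical; with 19 keypoints and two coordinates each you get the 38 columns he mentions, but the exact set depends on the pose model used:

```python
# Turn per-frame pose-estimation output into a frames x features matrix.
import numpy as np

# Hypothetical 19-keypoint set: 19 joints * (x, y) = 38 columns.
JOINTS = ["nose", "neck", "mid_hip",
          "l_shoulder", "r_shoulder", "l_elbow", "r_elbow",
          "l_wrist", "r_wrist", "l_hip", "r_hip",
          "l_knee", "r_knee", "l_ankle", "r_ankle",
          "l_heel", "r_heel", "l_toe", "r_toe"]

def swing_to_matrix(frames):
    """frames: list of {joint_name: (x, y)} dicts, one per video frame.
    Returns an (n_frames, 38) array: one row per frame, x/y per joint."""
    rows = []
    for frame in frames:
        row = []
        for joint in JOINTS:
            x, y = frame[joint]
            row.extend([x, y])
        rows.append(row)
    return np.asarray(rows)
```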
[00:42:56] Ken: So did you have to adjust the pose estimation model at all? Did you use any transfer learning to, for example, get a spot on the bat as well, or anything along those lines?
[00:43:09] Jared: So, we looked at some of that stuff. Ultimately, we didn't; we actually ignored the bat. Pose estimation only looks at the body.
And so we were just looking at how the body moves. You know, we could have handed them a 32-ounce two-by-four and really seen the same thing in terms of movement. So the bat doesn't play any component in that. There were also some other practical considerations: we could have taken video at all kinds of frame rates.
We decided on 60 frames per second. It was kind of that happy medium. We wanted to be able to see, on a frame-by-frame basis, the small changes that happened, but we didn't wanna go way overboard, because each frame of each video has to be processed. And so we were trying to balance the efficiency and the time it would take to process a video against how quickly we could give feedback to the player.
You know, in MLB, where some teams are studying pitches, there are cameras at 17,000 frames per second; that would've just been a nightmare from a processing perspective. So we had to find that happy medium. And then, so, first was gathering the data.
So we gathered the data on those players, hundreds, maybe a thousand swings, to feed through and build the model. And then we gathered that same data in the cage. To increase our success rate, we knew where the cameras would be fixed in the cage, and so we captured that data at basically the same distance.
So the cage is about eight feet wide from home plate to the edge. And so when we were taking the videos of the North Carolina State players, I positioned the camera at that same distance and that same height, to make them look the same. We did have to do some transformations to make them height invariant.
So what we do is we rescale all the pictures. And part of that was, we had a lot of variety in our dataset: we had softball players as short as 5'4" and baseball players as tall as 6'5", 6'6". So we had a lot of variety in our training data, which helped, but none of our 10-to-14-year-olds were even close to 6'5".
And so we transformed them all to make them height invariant, so this would translate to kids of any age and, you know, adults of any age and size.
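One simple way to do that kind of height normalization is sketched below. This is an assumption about the approach, not the Batting Lab's actual transformation: re-center each frame on a reference joint and divide by a body-length proxy, so tall and short players land on the same scale.

```python
# Hypothetical height-invariance transform for pose keypoints.
import numpy as np

def normalize_pose(frame_xy, hip_idx, ankle_idx):
    """frame_xy: (n_joints, 2) array of pixel coordinates for one frame.
    Re-centers on the hip and scales by the hip-to-ankle distance,
    a rough proxy for the player's height."""
    hip = frame_xy[hip_idx]
    scale = np.linalg.norm(frame_xy[ankle_idx] - hip)
    return (frame_xy - hip) / scale
```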
[00:45:27] Ken: Is there any issue with... I mean, this is probably more nuance than you need to get into, but is there any issue with individual body ratios and how they change as we get older? Like, for example, if kids' torsos grew before their legs, that would make the scaling possibly not as accurate. Or is that just too far down the rabbit hole?
[00:45:50] Jared: So, it's a good question. It might be an issue; I didn't study it. There are bigger sources of error than, potentially, the awkward size of kids in a ... state. You know, the biggest issue is just the accuracy of the mapping in pose estimation. The more stable you are, the more static you are, the better pose estimation is at mapping. But unfortunately, the important parts of the swing are when your hands are moving and your arms are moving at maximum velocity.
And so I think there's a far bigger source of error there than in the disproportionate sizes of children in puberty, or potentially in puberty, versus young adults, or, I guess, college students who are 18 to 22, more or less. So, I mean, yeah, there's definitely some error and some things there.
That was one of the other things we had to look at, too: pose estimation models generally have some different frameworks and algorithms underneath them, and most of them are broken up into a good, better, best. And, you know, best takes a lot longer to process, sometimes an order of magnitude more than good or better.
And so there's a lot of kind of tinkering with what's our accuracy versus speed trade-off, which is a thing that a lot of practitioners face. You know, if this application were done as a more offline thing, for more detailed stuff, maybe you'd want to go to best, because an extra 10 seconds of processing doesn't hurt you and the accuracy is more important.
Or maybe you wanna try and move this onto a mobile device, and so good, you know, the fastest one, is what you need, cause you just have limited processing power, things like that. Those are all considerations that we had to talk about and decide. And then, as a practitioner, I made the best decision and documented it.
Here's what we did, here's why we did it, and then, you know, commit and move on. But those are all things that could be revisited, and in different applications we might have made different decisions...
[00:48:02] Ken: I really like that. Since this is a public project, you can sort of peel back the layers and the decisions that were made. To me, that's one of the things that we can't always get into with our work.
And this sort of showcase on that front is exciting, because a lot of students, or people who are interested in getting into this domain, don't realize how many trade-offs go into each individual decision. You have to think about the use case. You have to think about the end user.
You have to think about the value. Speaking of the end user, or one of the end users, what were the outcomes for the kids? I believe the six-week trial, where the kids were coming in and seeing how much they could improve, is probably over. What type of results did they see?
[00:48:47] Jared: Yeah, so just for background: we started with 11 kids and ended with 10; one had to drop out for an injury, not during the batting lab, but outside of it. And they went through 12 sessions, approximately twice a week. And one of the key KPIs we were looking at is what was their exit velocity.
So we measured the exit velocity of every hit they had. And across all of the players, if you took their max exit velocity from the first session, basically how the players were before any instruction, and then looked at their exit velocity in session 12 at the end, about six weeks later, all of them improved their max. Most were about 15% better.
Which is great. But the thing that was even more impressive to me is that a lot of the players, and I don't know how to describe this without my hands, basically, you know, they had a max of, say, 53 miles an hour in session one. In session 12, they had 30% or 40% of their hits above that high-water mark from session one.
So it's not just that they could hit the ball harder occasionally; it's that they were now consistently hitting the ball harder than they had ever hit the ball before this started. That's really cool. And I think that's the measure that's more impressive. I can't quantify it exactly, but that's the idea: the percentage of hits above their high-water mark, for some of the players, was just dramatic.
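That consistency measure reduces to a couple of lines of Python. The numbers below are made up for illustration, mirroring Jared's 53 mph example:

```python
# Share of session-12 hits that beat the player's session-1 max.
session_1 = [44, 47, 51, 53, 49, 50]   # exit velocities in mph (made up)
session_12 = [52, 55, 56, 49, 57, 54]

high_water = max(session_1)            # 53 mph in this example
above = sum(v > high_water for v in session_12)
print(f"{above / len(session_12):.0%} of session-12 hits beat the old max")
```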
[00:50:18] Ken: That's amazing. I'm gonna absolutely grill you here, though. So, was there a control? Are we able to see if their teammates, or any of those people, had improved in that same period?
[00:50:30] Jared: No, no, no. You know, it's a very fair question. And yes, about the control: again, part of it is that the focus of this activity was to help kids get better at baseball, but the sneaky part was to help them get better at data and visualizations. And so, yes, we could have done a control where we brought in another set of kids from that same pool, let them hit the ball, and didn't give them any feedback, to see how much they would've improved.
And would they have improved? Yes, I'm sure they would have. I dunno how much, but I'm sure it's not zero; it's a positive, non-negative number. But, you know, again, the main point of the project was not to help a few kids get better at baseball and softball. That's just a positive externality. We wanted to teach them about data and get them involved in data.
As a statistician, I would've loved to run the control. But from a public good perspective, and as a parent, and from an educational perspective: if my kid had been in the control and just had to go hit a ball, they would've thought that was fun, but they would've missed out on, as a data science nerd, the part that I would be most interested in, which was helping them learn and think data was cool, and that data applies to more than just math class.
[00:51:46] Ken: I agree. And I think, at least from a heuristic perspective, those types of numbers that they improved by are significant, beyond what a kid would see over the course of a season. If you talk to any coach, they'd be like, Wow, that's an impressive leap. You know, I played baseball till I was almost 16 or 17.
And I don't think I ever made more than maybe a couple miles per hour of ball speed change, less than a couple percent, over the course of a full season. Right? I mean, it just doesn't happen at that magnitude.
[00:52:21] Jared: So yeah, it's a very fair criticism that we didn't have a control, and so it's hard to know. But anecdotally, a couple of anecdotal things I can share. The owner of the facility where we had the cage, he was a Division I baseball player. And so we brought him into the cage to do some testing, and the recommendations that he got back were: no improvement needed.
So we at least know that we weren't overprescribing. We brought in another youth tester in the beginning; she was a softball player. She took five or six swings, and the feedback was: you need to follow through in your swing. You're stopping when you make contact; you need to follow through.
And she turned and looked at her parents and laughed, because I guess that's the feedback her coach had been giving her for the entire season. And the system was able to detect that in the first five or six swings. Another positive externality, or thing we heard about: one of the participants, his name was Drew, two weeks after the program ended, he actually hit his first home run in a travel baseball game.
And so he was ecstatic, and he was one of the kids that had kind of the biggest improvement. So, you know, it's a fair criticism that the kids would've gotten better anyway, and we don't know how much better the control would've gotten. But in my mind, that's not necessarily the point. They got better, and they learned about data. But it's fair.
[00:53:43] Ken: Well, I think there's beauty in that too. A lot of the projects we do can be for our own personal benefit. They don't necessarily have to be perfectly statistically significant or rigorously scientific.
I would actually argue that a lot of projects companies have in production aren't perfectly statistically significant either, right? You're putting out, for example, a linear regression, and you're leaving the non-significant variables in because they're producing better results, not because of what a p-value says.
I think there's a sort of artistry in the data domain: combined with human perception, we can see whether something is likely helping, likely hurting, or likely net neutral, and that's part of the decision-making process. I would rather put something into production that I think probably won't hurt us and will likely help us, even if I don't know how much of each it will actually do, because the expected value is still positive.
And I think so many people get wrapped up in this. I know for you as a statistician, or historically as a statistician, we're taught in school that that's sacrilege. But if you're in a position to create more value for your company at a very low risk of things going bad, why wouldn't you try it and experiment with it? That's also part of the beauty of what we do: it's highly experimental.
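(A small illustrative sketch of Ken's point, not anything from the episode: fit a regression with and without a borderline variable and choose by holdout error rather than by p-values. All data and variable names below are made up.)

```python
# Illustrative only: pick predictors by holdout performance, not p-values.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "x1": rng.normal(size=n),   # strong predictor
    "x2": rng.normal(size=n),   # weak but real predictor
})
y = 2.0 * X["x1"] + 0.2 * X["x2"] + rng.normal(scale=2.0, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

full = sm.OLS(y_tr, sm.add_constant(X_tr)).fit()
reduced = sm.OLS(y_tr, sm.add_constant(X_tr[["x1"]])).fit()
print(full.pvalues["x2"])  # x2 may well miss p < 0.05 in this noisy sample

mse_full = mean_squared_error(y_te, full.predict(sm.add_constant(X_te)))
mse_red = mean_squared_error(y_te, reduced.predict(sm.add_constant(X_te[["x1"]])))
print(f"holdout MSE with x2:    {mse_full:.3f}")
print(f"holdout MSE without x2: {mse_red:.3f}")
# If keeping the "non-significant" x2 gives lower holdout error, that's a
# defensible production choice regardless of its p-value.
```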
[00:55:12] Jared: Yeah. So I teach an MBA statistics class, and I have a whole lecture devoted to the difference between practical significance and statistical significance. And I like that, because yes, we'll do all these problems and we'll find significance, but is there any real practical significance?
And at that point, God gave you a brain; use it. We shouldn't be ruled simply by numbers, but we don't wanna ignore data either. And in this case, we didn't ignore it. It's not that we thought we were doing something we could write a peer-reviewed, refereed paper on. We knew what we were doing, and we chose this as the better path because of what we were trying to gain from it.
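(An illustrative aside, my example rather than anything from Jared's lecture: with a large enough sample, a practically trivial difference still comes out statistically significant.)

```python
# Illustrative only: statistical vs. practical significance at large n.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 2_000_000
a = rng.normal(loc=100.00, scale=15.0, size=n)  # group A
b = rng.normal(loc=100.05, scale=15.0, size=n)  # group B: +0.05 points

t, p = stats.ttest_ind(a, b)
print(f"p-value:         {p:.4g}")                    # typically < 0.01
print(f"mean difference: {b.mean() - a.mean():.3f}")  # ~0.05 on a 100-pt scale

# The test screams "significant," but a 0.05-point shift against a
# standard deviation of 15 has no practical significance at all.
```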
[00:55:49] Ken: Amazing. And so my second question, which I did wanna touch on: inferentially, what were the most common things that kids needed to work on? Were there a couple of things where almost all the kids had the same problem in their swing?
[00:56:05] Jared: Yeah. It's something we found with both the test kids beforehand and the actual kids during the activation. The key thing that kids don't do is keep their hands back. All of the elite players keep their hands basically directly above their back foot until the front foot touches the ground; when that toe or heel touches, that's when they start to swing. Almost no youth player does that naturally without coaching and correction. Their concern is that if they don't start to swing early, they'll miss the ball.
And so it's about helping them get that. There are a lot of physics and kinesiology reasons why they have to keep their hands back. It allows them to open their hips, separate, and generate more power. It also forces them to take a shorter path to the ball, which helps increase acceleration, all that kind of stuff.
There's a whole bunch of physics that we didn't tell the kids about, but that the college coaches all talk about. The main thing is, if you had one piece of advice for your youth baseball or softball players: their hands have to stay above their back foot until they swing.
And then the other thing. So, again, I never played baseball as a kid; I've never played in any organized baseball. And with my son, who's an active baseball player and hopes to play in college, I hear coaches say lots of stuff and I'm like, I don't know what that means.
I feel like I'm relatively smart, but I have no idea what some of those phrases mean. So one of the things we tried to do here was break down: what is the action the player is actually supposed to make? And the other thing we saw kids do is not take that shortest path to the ball.
They understand the geometry rule, but they don't know how to apply it. It's about moving the knob of the bat in more or less a straight line to the ball; that's the other way they generate power. So that's the other thing kids don't do, and I've heard coaches say all kinds of weird things about it that didn't make any sense to me.
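(To make the hands-back rule concrete, here's a hypothetical sketch of how such a check could look given 2D pose keypoints per frame. The batting lab's actual rules, thresholds, and pose pipeline aren't described in the episode, so every name and number below is invented.)

```python
# Hypothetical check: do the hands stay over the back foot until the
# front foot touches down? Assumes per-frame 2D keypoints in image
# coordinates (y grows downward), from any pose estimator.
import numpy as np

def hands_back_ok(frames, tol_px=30):
    """frames: dicts with 'wrist', 'back_ankle', 'front_toe' -> (x, y)."""
    # Approximate front-foot touch-down as the frame where the front toe
    # reaches its lowest on-screen point (largest y).
    touch_idx = int(np.argmax([f["front_toe"][1] for f in frames]))
    for f in frames[: touch_idx + 1]:
        if abs(f["wrist"][0] - f["back_ankle"][0]) > tol_px:
            return False  # hands drifted forward before the front foot landed
    return True

# Made-up example: the wrist starts over the back ankle, then drifts early.
frames = [
    {"wrist": (400, 200), "back_ankle": (410, 520), "front_toe": (300, 480)},
    {"wrist": (395, 205), "back_ankle": (410, 520), "front_toe": (300, 510)},
    {"wrist": (330, 210), "back_ankle": (410, 520), "front_toe": (300, 522)},
]
print(hands_back_ok(frames))  # False: ~80 px of drift before touch-down
```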
[00:58:08] Ken: You know, it's funny you say that. I played golf in college and tried to play professionally for a while, and I remember I was working with a swing coach, and he kept telling me, you need to feel like you're releasing your hands back here. And I was like, what does that even mean, dude?
Like, should the club actually be releasing back there? What is the club actually doing in that part of the golf swing? Because me feeling it is not gonna be reliable; what you feel doing it versus what I feel is not gonna match. And I got in a shouting match with this guy on the range. I was so pissed, because he could not explain it any other way.
And I was like, look, that's just not gonna work for me, and I stopped working with that coach. I was a little bit of a diva when I played. But the idea is that I want to be as close to the ground truth as possible, and a lot of coaching does not focus on the ground truth. It focuses on feelings.
It focuses on this and that. And if there's jargon and these other things involved, I feel like that in some sense detracts from the learning process for kids, because they're always gonna be chasing that feeling. They're gonna say, oh, I wanna feel this, I wanna feel this.
And then after a while, when your body adjusts, that feeling is gonna produce completely different results. So understanding the actual physical positions, the more quantitative elements of the swing, has really good benefits. If kids start to learn like that, it's gonna pay better dividends in the future, with more repeatability, than chasing after all of these different feelings.
[00:59:43] Jared: Yeah. We tried to study those hundreds of swings we had gathered to see what elite players actually do, and that's what we focused our feedback on. And a couple of points on that: one of the things that was really impactful for the kids was that we showed them slow-motion videos of what they looked like.
So they were able to watch themselves, which was really powerful. The computer told them, you need to keep your hands back when you swing, and then they saw a slow-motion video where they didn't keep their hands back. That helped them. And as we planned, we knew that not everybody in the world would be able to take advantage of getting into the batting lab.
So one of the things we did is produce a data playbook, a free interactive guide that anybody can access. The key learnings we had, the instructions, and the drills are all contained within that data playbook. It's designed for a player and a friend or a caregiver of some kind; it doesn't have to be a baseball or softball coach.
They don't even have to know what to do with baseball or softball. It gives activities, it gives homework, and it lets you chart and graph some of the things you're seeing. So you can have a friend or a grandparent hold up a cell phone and take video of you swinging.
Then you can look at that and use the data playbook to help improve your swing. It's a resource we made because we knew not everyone would be able to take advantage of the lab, and we wanted to make something that was accessible to everybody.
[01:01:16] Ken: That is awesome. So what are you most proud of related to this project? What do you think the biggest takeaway for you personally is?
[01:01:26] Jared: I guess the thing I'm most proud of is that it worked. When we first started out, I got asked by a number of senior leaders at SAS, basically: okay, this isn't necessarily cheap; is it gonna work? And I said yes, and I'm glad that it worked. You know, but it was...
[01:01:47] Ken: What percent sure, in retrospect, would you say you were that it would work?
[01:01:52] Jared: Well, okay. In fairness, because again I have a baseball child and I don't know anything about baseball, I had done some pre-work before this project ever materialized: okay, how can we take cell phone video and analyze it a little bit? If you have a one-minute video of swings in a batting cage, how can you break that into individual swings and look at them, things like that.
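(Jared doesn't spell out how that pre-work was done; as a rough reconstruction of the idea, one simple way to cut a cage video into individual swings is to look for bursts of frame-to-frame motion energy. The OpenCV sketch below is illustrative, and the threshold, gap length, and file name are assumptions.)

```python
# Illustrative only: segment a batting-cage video into swing windows by
# thresholding frame-difference "motion energy."
import cv2
import numpy as np

def swing_windows(path, thresh=8.0, min_gap=30):
    cap = cv2.VideoCapture(path)
    energy, prev = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None:
            # mean absolute pixel change between consecutive frames
            energy.append(float(np.mean(cv2.absdiff(gray, prev))))
        prev = gray
    cap.release()

    # Group above-threshold frames into bursts separated by quiet gaps.
    windows, start, last = [], None, None
    for i, e in enumerate(energy):
        if e <= thresh:
            continue
        if start is not None and i - last > min_gap:
            windows.append((start, last))
            start = i
        elif start is None:
            start = i
        last = i
    if start is not None:
        windows.append((start, last))
    return windows  # [(first_frame, last_frame), ...] per candidate swing

# e.g. swing_windows("cage_session.mp4") -> [(112, 140), (390, 421), ...]
```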
So I had done some work on this; it wasn't a completely brand-new problem. I was pretty confident, but I wasn't at a hundred percent. I mean, if I had to make a prediction, I'd probably say I was 75% sure it was gonna work. And there were challenges along the way, things that we didn't foresee, but...
I was happy it worked. For a personal benefit, I was like, okay, this worked, I'm happy. And it was great for my own child. I got to use him as the test dummy for a lot of this stuff, practicing with the technology and things. It was a great time to work with him, and his swing has improved too.
And then seeing these kids improve and become more confident with data, like, again, as I said, it's just such an essential skill. Because of my background, my kids get exposed to a lot of this, and your listeners that have children, their kids probably get exposed to this too. But this is a skill that's essential for everybody.
And so being able to bring this skill to people who might not talk about sorting algorithms or number theory, or even baseball statistics, that much at home, to increase their exposure to it and help increase their confidence, I think is really...
[01:03:38] Ken: Yeah, I mean, at least to me, it seems like this is a hundred percent worthwhile. I love the broader message. And as you can imagine, I'm someone who cares very deeply about spreading data literacy. It's something I've dedicated quite a lot of time and effort to as well, and have made a very nice place in the community around.
So this to me is very in line with a lot of the things that I care fairly deeply about. Something I am interested in, and you mentioned it offhand: what was the biggest bottleneck of the project? I love talking about the problems people face and how they get around them, because it shows that we all have these problems.
[01:04:19] Jared: Oh yeah. This project was filled with problems. The biggest one was data acquisition, the whole scoring system. Part of it had to do with some technology we had planned on working that didn't quite work. And when I say the whole system: we were able to take the camera footage we got relatively easily and straightforwardly.
But one of the unique aspects of the batting lab is that we had weight sensors in the batter's boxes, and synchronizing that data with the video, getting a system that worked with those two things, proved to be a lot harder than we thought. We thought we had some commercial software we could use to synchronize them.
But that ended up not working, and we had to work through that. So where I thought I was gonna have synchronized data five months before delivery, I ended up getting it more like four or five weeks before delivery. And then we had some COVID quarantine issues and other things like that.
So some of our system tests got canceled. That was definitely a challenge in the project that I didn't anticipate or plan for, and it ended up costing me some very late nights trying to make sure we got ready for opening day.
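(Jared doesn't say how the synchronization was ultimately solved; as a hypothetical illustration of what aligning the two streams involves, the pandas sketch below matches each video frame to the nearest force-plate sample by timestamp. The sample rates, column names, and tolerance are all invented.)

```python
# Hypothetical only: align a 1000 Hz weight-sensor stream with 240 fps
# video frames stamped against the same clock.
import pandas as pd

frames = pd.DataFrame({
    "t": [i / 240 for i in range(5)],        # frame timestamps (seconds)
    "frame": range(5),
})
sensor = pd.DataFrame({
    "t": [i / 1000 for i in range(25)],      # sensor timestamps (seconds)
    "back_foot_lbs": [90 + i for i in range(25)],
})

# Both tables must be sorted by "t"; merge_asof then attaches the sensor
# reading nearest in time to each frame, within a 5 ms tolerance.
aligned = pd.merge_asof(frames, sensor, on="t",
                        direction="nearest", tolerance=0.005)
print(aligned)
# With frames and force readings paired, an event like "front foot touched
# down" in the sensor data can be mapped to an exact frame of video.
```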
[01:05:33] Ken: That's why you were so nervous watching me hit balls in the cage.
[01:05:37] Jared: That's right. Yeah, like, okay, here are the magic buttons, type in these numbers. Actually, somebody was talking to me and they hadn't quite grasped that this was an automated system, that my work was all done before anybody stepped in the cage.
They were like, so are you here all the time when the kids are hitting? And I'm like, no, I'm not really here that much. I pop in and watch and make sure everything's working. But they thought somebody was there manually reviewing every swing that happened, that it wasn't a system or a computer doing the work.
And once they saw it, they thought, wow, the computer is doing this. Just being able to create a system that functions and that kids can interact with, even with rudimentary interaction, it was just great to see kids being able to do that.
One of the design aspects of the cage, sorry to all those parents out there, is that we designed it to be a really intimate, immersive experience inside and not really the best viewing experience outside, because we didn't necessarily want mom or dad or an uncle, whoever, back in there talking over the computer or giving extra correction or coaching to the player.
We wanted to make it as immersive an experience as possible and help them block out everything else so they could just focus on the task at hand: what do they need to do? And we moved them through a progression, from telling them what they needed to do to improve, to, towards the end, prompting them: what things do you need to do to improve?
It's helping them self-identify: how could their swing improve? What is the data showing them that they need to be doing? It was gratifying watching them evolve and learn through that from session one to session twelve.
[01:07:30] Ken: Yeah. Well, first, I was really impressed with the augmented reality component of the front end that you created. Did you do that on your own, or did you have a software team?
[01:07:43] Jared: No, somebody else did that. I'm more green text on a black background; that's more my style. The company we worked with did a fantastic job.
They've done installations before; they worked on Drake's last concert tour. They do amazing work. I could never have imagined or dreamed up what they came up with. It was far more than anything I would've designed.
[01:08:09] Ken: Of course, people can check out what it's like in the video that I made quite recently.
[01:08:14] Jared: Yes, I hope they do. My wife described it when she saw it for the first time, because I had talked about it a lot, and when she came she said, wow, this is like a Disneyland ride. This is way cooler than you had described it.
[01:08:27] Ken: That is incredible. So I only have a couple more questions, largely about SAS in general as a company. This seems like a pretty cutting-edge project, and obviously SAS has been around for a long time. How does an older company like SAS keep up to date with the newest trends in data? What are some of the things, especially related to open source, that you're doing as an organization to stay relevant there?
[01:08:53] Jared: Yeah, so SAS's foundational project was a grant from the Department of Agriculture back in the seventies; that's how SAS got its start. But it's been a commitment of SAS to continue to provide its customers with what they need to do their jobs. SAS isn't purchased; it's leased.
Every year you have an opportunity to renew your license, so people get to vote with their wallets. It's our job as employees and as a company to make sure we provide customers enough value that they want to continue to renew. There's a large amount of emphasis on developers and people in R&D understanding the latest trends, attending conferences, reading papers, and staying current; you wanna make sure that you're delivering cutting-edge stuff. A lot of the people in these groups have PhDs and postdocs in the areas they're working on, and they're trying to move the science forward and provide accurate and detailed information. It's just part of the corporate culture to make sure you're delivering excellence to our customers.
[01:10:20] Ken: That's awesome. Something else I'm interested in: you mentioned that you're a boomerang at the company. What's changed, organizationally, from when you started to where you are now?
[01:10:35] Jared: Organizationally, the tools and things have changed, but a lot of the processes and the corporate culture have not. It's a demanding culture in that you wanna do good work, you work with really smart people, and you wanna make sure you're not the weak link in the chain; I guess that's part of my motivation.
I wanna make sure I'm contributing to my team and not letting my team down. I work with some of the smartest people I've ever met. And our tools have changed: when I started, it was more big Unix boxes, Solaris boxes...
Whereas now it's all Linux, and it's Docker containers, Kubernetes, things like that. So those tools have changed, and how we go about developing software has changed, from a waterfall method to an agile method. But largely the culture and the ideas are the same: it's about hiring great people, giving them the motivation and the idea of what they should be doing, and then letting them do their work and getting out of their way. I think that has been a consistent theme through my 17 years here, minus the boomerang years.
[01:11:55] Ken: Amazing. Well, Jared, those are all the questions I have. Do you have any final thoughts? And where can people find out more about you?
[01:12:05] Jared: Yeah. Thank you again for having me on. To find out more about me, you can find me on LinkedIn and Twitter. For the batting lab, if you do a Google search for the Batting Lab, it should hopefully come up.
There's also information about the batting lab on the sas.com website, along with a link to the Data Playbook. So again, if any of you have children, friends, or nieces or nephews you wanna work with, and they're interested in baseball or softball,
I think it's a great resource they can use. And I guess my advice for the audience of this podcast is: find something you're passionate about; that's gonna help you build long-term success. Find something that you want to do.
Then you'll learn the things that are necessary to do that job. That's been kind of a north star I've followed: finding things that are interesting, trying to make any job you have interesting, automating away the boring parts and trying to do more of the interesting parts. I've had a great career following that mantra.
[01:13:18] Ken: Amazing, Jared, thank you so much again, this was an awesome conversation. I'm so excited to share it with everyone.
[01:13:23] Jared: Thank you, Ken. I appreciate that.