Big Data on a Campus - Asking the Right Questions
Last year, it seemed impossible to read any higher education journal that did not have at least one article on MOOCs. The MOOC furor is slowing but is being quickly superseded by the salvation of “Big Data.” From student success and improved administrative efficiencies to research analytics, “Big Data” is the proposed cure for all that ails a campus. If you believe all the hype, “Big Data” will be able to predict when or if a student will graduate, what their salary will be in ten years and when we are going to die. I am very interested in the latter. It is like the TV magic of forensic science in crime shows. With the touch of a few keys, millions of records from diverse systems, in different data formats and locations, are magically translated into pretty colored dashboards providing answers to questions we did not ask.
"The new ‘Big Data’ world does add new exciting and frightening data sources"
The reality is that big data is not new to universities. We capture and analyze huge amounts of data from our many systems on a daily basis. The new “Big Data” world does add new exciting and frightening data sources. Our structured ERP data is now mixed with messy data from K-12, testing sites, social media, and learning management systems. We can track a student’s location via Wi-Fi logs for attendance records, but should we not question the violation of basic privacy principles? When we start reviewing non-traditional data elements and integrating them with data from other universities and social media, we find that our data is often dirty. Vague data definitions only exacerbate the issue. Just try to get three institutions to agree on the definition of something as simple as a tenured faculty member or a full-time student. It is amazing how many different ways you can code a person’s gender.
What is more troubling is the onslaught of analytics consulting firms on campus with slick sales demonstrations. They promise our boards, presidents and provosts “predictive analytics magic” if we just hand over access to our ERPs and any other piece of data we have lying around. Not being able to protect data causes most CIOs real discomfort. Information security and privacy are at the core of our profession. You have to wonder when the companies do not have any questions about the data they are analyzing. When I analyze data, I always end up with more questions than answers. No data is clean, and it takes a great deal of care and expertise to provide an authentic analysis. The real work is the ongoing data cleansing process, in which we find and correct the “dirty data.” Researchers pride themselves on their data quality. Until Disney mails us a few cases of fairy dust, we will need a reality check on all the promises of “Big Data.”
With all this apparent discontent, let me first say that I love the power of large data sets and the new analytic tools. Quality data has enormous promise in helping us answer questions that seemed impossible just a few years ago. “Big Data” is helping us think outside our normal academic boxes and ask new questions. We are no longer bound to reviewing only traditional student and financial data (high school GPAs, ACT scores, costs, enrollments and retention rates). These pale in comparison to the new student success “Big Data” triggers and predictors. How do the combination and sequence of certain courses produce the best student outcomes? How do you find students who are failing due to a poor eighth-grade algebra experience and remediate them with personalized education resources? Scanning social media data provides some of the best insights for customer service and potential new students. How will we find the gold needles in the thousands of data haystacks?
But many times we focus on minutiae and overlook the obvious. Can we learn from online dating sites and use data to match lonely students with friends who fit their profile? Such friends may be our best bet to provide the personal and emotional support that will keep a student in school. Can we use a similar matching technique to pair students’ skills and passions with degrees, and with faculty who will academically accelerate each student’s unique abilities for success? The real promise and power of “Big Data” on a campus will come down to our ability to personalize academic, institutional and medical information.
We must be imaginative in our information quests while continually questioning the data and answers. It is a great time to be a CIO in higher education, especially coming from a medical background. I do believe that we could cure ten percent of the diseases just by asking the right questions of our massive medical data. The tools keep getting easier and more powerful, but therein lies the problem. “Big Data” is a double-edged sword: it can provide answers to our most pressing questions, but when used by untrained or unethical individuals, it can lead to devastating results. Beautiful reports can be quickly generated, but these may or may not be based on sound data principles. We will all be in trouble if we do not instill the core higher education principle of questioning the answers.
Social, mobile and cloud computing are combining to give students, faculty and staff more “Big Data” options, providing new intuitive and predictive information that will help guide them to new learning outcomes. Information resources will continue to be tailored to user needs based on their role and academic acumen. Every method of engaging students is being reviewed. Data, whether big or small, will have a role in the future success of a student or campus. We must not overlook the need to ask the right questions and continue to question the answers.