School of Information Sciences - Information Science & Technology Program

Big Data Analytics Specialization

“Big Data” refers to sets of data that are so large and complex that it is difficult to use them effectively and efficiently. Thanks to advances in technology, the amount of available information increases daily. In the next few years, we will routinely be trying to use petabytes of data stored in multiple formats across different platforms. The volume and diversity of data make it extremely challenging to store, retrieve, analyze, and utilize this information. Businesses, government agencies, and society need experts with the skills and knowledge to design, develop, and deploy complex information systems and applications that deal with multi-terabyte data sets. The iSchool has created a specialization in the MSIS program that emphasizes big data analytics and provides students with essential, in-depth knowledge of the techniques and technologies relevant to big data management. Coursework will cover the design and maintenance of infrastructure to efficiently store, easily access, and transfer extremely large amounts of data over wide area networks.

The Big Data challenge involves three major dimensions: data size, data rate, and data diversity. The iSchool’s Big Data Analytics specialization will prepare students to address real-life problems along each of those dimensions. For instance, it is not uncommon for digital archives to store terabytes and even petabytes of data in hundreds of data repositories supporting thousands of applications. Maintaining such data repositories requires knowledge of ultra-large-scale distributed systems, virtualization technologies, cloud computing, unstructured and semi-structured data management, and optimization methods based on data replication and data migration, as well as advanced data protection techniques. The exponential growth in the amount of data calls for competence in advanced dynamic data processing techniques, including scalable data processing methods and technologies; data stream management; and large-scale process monitoring, modeling, and mining. To comprehensively analyze such volumes of information from disparate disciplines, information professionals will need to master advanced data integration techniques and business intelligence tools, crowdsourcing technologies, large-scale information fusion, data-intensive computation, and semantic data management.

In a Computerworld article published in November 2012, the IT research firm Gartner estimated that 4.4 million IT jobs will be created in the area of big data by 2015. However, Gartner’s head of research, Peter Sondergaard, notes a serious shortage of IT professionals with big-data skills (“There is not enough talent in the industry”) and predicts that only one-third of the new jobs will be filled. The new Big Data Analytics specialization will prepare graduates to provide the much-needed expertise to advance this burgeoning field.

Why will there be such a significant need for Big Data analysts and specialists? Because every industry sector and service entity has to deal with Big Data or can benefit from harnessing the power of so much information. Obviously, those who work in data-rich disciplines such as astronomy, or in fields such as online retail, depend on the tools and technologies of Big Data management. However, digital data is everywhere, and employers from a wide range of sectors (healthcare, finance, place-based retail, manufacturing, and transportation, to name just a few) will be looking to build workforce capacity to enhance their productivity and competitive position in global markets.

Lead Faculty

Pre-requisites for this Specialization

Students must have taken IS 2500 Data Structures or an equivalent, as well as a course in the Java programming language, prior to entering the Big Data Analytics specialization. This is in addition to the other pre-requisites for the MSIS program listed here.

Specialization Plan of Study

Any changes to the distribution of credits below must be requested, in advance, through petition to the GIST faculty.

Note: Recommended courses have been pre-approved to fulfill the following academic areas. You may choose classes from outside of the list of recommended courses and are encouraged to discuss your options with your academic advisor.

Mathematical and Formal Foundations area (6 credits)

Required courses:

  • INFSCI 2160 Data Mining
  • INFSCI 2591 Algorithm Design

Cognitive Science area (6 credits)

Recommended courses:

  • INFSCI 2410 Introduction to Neural Networks
  • INFSCI 2415 Information Visualization
  • INFSCI 2430 Social Computing
  • INFSCI 2480 Adaptive Information Systems

Systems and Technology area (18 credits)

Required courses:

  • INFSCI 2710 Database Management
  • INFSCI 2725 Data Analytics
  • Either INFSCI 2711 Advanced Topics in Database Management OR INFSCI 2750 Cloud Computing

Recommended courses:

  • INFSCI 2150 Security and Privacy
  • INFSCI 2711 Advanced Topics in Database Management
  • INFSCI 2750 Cloud Computing
  • TELCOM 2120 Network Performance
  • TELCOM 2310 Computer Networks

Electives (6 credits)

The electives can be chosen to meet the individual needs of the student and may include classes in Advanced Statistics and domain-specific areas.

Recommended courses:

  • INFSCI 2140 Information Storage and Retrieval
  • INFSCI 2135 Probabilistic Methods for Computer-based Decision Support
  • INFSCI 2801 Geospatial Information Systems
  • INFSCI 2802 Mobile GIS and Location-Based Services
  • INFSCI 2809 Spatial Data Analytics
  • INFSCI 2821 Introduction to Biomedical Informatics
  • INFSCI 2915 Special Topics: Machine Learning

City of Opportunities

Pittsburgh is big enough that opportunities—for internships, jobs, and access to a rich cultural scene as well as the outdoors—abound.