Module 1 - Business Overview: Why Big Data Business Intelligence in Telco
- Case Studies from telecom operators like T-Mobile, Verizon etc.
- Big Data adoption rate among North American telcos, and how they are aligning their future business models and operations around
Big Data BI
- Broad Scale Application Area
- Network and Service management
- Customer Churn Management
- Data Integration & Dashboard visualization
- Fraud management
- Business Rule generation
- Customer profiling
- Localized Ad pushing
Module 2 - Introduction of Big Data
- Main characteristics of Big Data: volume, variety, velocity and veracity. MPP architecture for volume.
- Data Warehouses – static schema, slowly evolving dataset
- MPP Databases like Greenplum, Exadata, Teradata, Netezza, Vertica etc.
- Hadoop Based Solutions – no conditions on structure of dataset.
- Typical pattern : HDFS, MapReduce (crunch), retrieve from HDFS
- Batch- suited for analytical/non-interactive
- Velocity: CEP streaming data
- Typical choices – CEP products (e.g. InfoSphere Streams, Apama, MarkLogic etc.)
- Less production ready – Storm/S4
- NoSQL Databases – (columnar and key-value): Best suited as analytical adjunct to data warehouse/database
- NoSQL solutions
- KV Store - Keyspace, Flare, SchemaFree, RAMCloud, Oracle NoSQL Database (OnDB)
- KV Store - Dynamo, Voldemort, Dynomite, SubRecord, Mo8onDb, DovetailDB
- KV Store (Hierarchical) - GT.m, Cache
- KV Store (Ordered) - TokyoTyrant, Lightcloud, NMDB, Luxio, MemcacheDB, Actord
- KV Cache - Memcached, Repcached, Coherence, Infinispan, eXtreme Scale, JBossCache, Velocity, Terracotta
- Tuple Store - Gigaspaces, Coord, Apache River
- Object Database - ZopeDB, db4o, Shoal
- Document Store - CouchDB, Cloudant, Couchbase, MongoDB, Jackrabbit, XML-Databases, ThruDB, CloudKit, Persevere,
Riak (Basho), Scalaris
- Wide Columnar Store - BigTable, HBase, Apache Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI
Module 3 - Introduction to Data Cleaning issue in Big Data
- RDBMS – static structure/schema, doesn’t promote agile, exploratory environment.
- NoSQL – semi-structured; enough structure to store data without defining an exact schema beforehand
- Data cleaning issues
Module 4 - Big Data - Hadoop
- When to select Hadoop?
- STRUCTURED - Enterprise data warehouses/databases can store massive data (at a cost) but impose structure (not good for active
exploration)
- SEMI STRUCTURED data – tough to do with traditional solutions (DW/DB)
- Warehousing data = HUGE effort and static even after implementation
- For variety & volume of data, crunched on commodity hardware – HADOOP
- Commodity H/W needed to create a Hadoop Cluster
Module 5 - Introduction to Map Reduce /HDFS
- MapReduce – distribute computing over multiple servers
- HDFS – make data available locally for the computing process (with redundancy)
- Data – can be unstructured/schema-less (unlike RDBMS)
- Developer responsibility to make sense of data
- Programming MapReduce = working with Java (pros/cons), manually loading data into HDFS
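As a rough illustration of the MapReduce model outlined in this module, the sketch below mimics the map, shuffle and reduce phases in plain Python on a single machine. It is not Hadoop code: the call-record layout and the function names (map_phase, shuffle, reduce_phase, run_job) are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical input: one call record per line, "subscriber_id,cell_id,minutes".
RECORDS = [
    "s1,cellA,5",
    "s2,cellA,3",
    "s1,cellB,7",
    "s3,cellB,2",
]


def map_phase(line):
    """Emit (cell_id, minutes) pairs; parsing the raw line is the developer's job."""
    _subscriber, cell, minutes = line.split(",")
    yield cell, int(minutes)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the minutes recorded for one cell."""
    return key, sum(values)


def run_job(records):
    """Chain map -> shuffle -> reduce over all records."""
    mapped = (pair for line in records for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(run_job(RECORDS))
```

On a real cluster the map and reduce functions would run on many nodes, next to the HDFS blocks they read; here everything runs in one process so that the data flow is easy to follow.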
Module 6 - Spark: In-memory distributed processing
- What is “In memory” processing?
- Spark SQL
- Spark SDK
- Spark API
- RDD
- Spark Lib
- SAP HANA
- How to migrate an existing Hadoop system to Spark
Module 7 - Storm - Real-time processing in Big Data
- Streams
- Spouts
- Bolts
- Topologies
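The Storm concepts listed in this module can be pictured with a small single-process sketch. It uses Python generators instead of the Storm API: a spout is the source of a stream, each bolt consumes one stream and emits another, and the topology is the wiring between them. The event format and all function names here are hypothetical.

```python
from collections import Counter

# Hypothetical stream of (cell_id, event_type) tuples.
EVENTS = [
    ("cellA", "dropped_call"),
    ("cellB", "handover"),
    ("cellA", "dropped_call"),
    ("cellB", "dropped_call"),
]


def event_spout(events):
    """Spout: source of the stream; here it simply replays a fixed list."""
    for event in events:
        yield event


def filter_bolt(stream, event_type):
    """Bolt: pass on only the cell ids of tuples with the given event type."""
    for cell, kind in stream:
        if kind == event_type:
            yield cell


def count_bolt(stream):
    """Bolt: keep a running count per cell and emit each updated count."""
    counts = Counter()
    for cell in stream:
        counts[cell] += 1
        yield cell, counts[cell]


def run_topology(events):
    """Topology: spout -> filter bolt -> count bolt; returns the latest counts."""
    latest = {}
    stream = count_bolt(filter_bolt(event_spout(events), "dropped_call"))
    for cell, count in stream:
        latest[cell] = count
    return latest


if __name__ == "__main__":
    print(run_topology(EVENTS))
```

In Storm itself the stream is unbounded and each spout or bolt runs as parallel tasks across the cluster; the sketch only shows how tuples flow through the graph.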
Module 8 - Big Data Management System
- Moving parts, compute nodes start/fail: ZooKeeper – for configuration/coordination/naming services
- Complex pipeline/workflow: Oozie – manage workflow, dependencies, daisy chain
- Deploy, configure, cluster management, upgrade etc. (sys admin): Ambari
- In Cloud : Whirr
- Evolving Big Data platform tools for tracking
- ETL layer application issues
Module 9 - Predictive analytics in Business Intelligence - 1: Fundamental techniques and machine-learning-based BI
- Introduction to Machine learning
- Learning classification techniques
- Bayesian prediction – preparing the training file
- Markov random field
- Supervised and unsupervised learning
- Feature extraction
- Support Vector Machine
- Neural Network
- Reinforcement learning
- Big Data large variable problem – Random forest (RF)
- Representation learning
- Deep learning
- Big Data Automation problem – Multi-model ensemble RF
- Automation through Soft10-M
- LDA and topic modeling
- Agile learning
- Agent-based learning – example from Telco operation
- Distributed learning – example from Telco operation
- Introduction to open source tools for predictive analytics: R, RapidMiner, Mahout
- More scalable analytics – Apache Hama, Spark and CMU GraphLab
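To make the Bayesian prediction topic above more concrete, here is a minimal naive Bayes classifier over categorical features, written in plain Python with add-one (Laplace) smoothing. The training rows, feature names and labels are invented for illustration and do not come from any operator's data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training file: ((plan_type, complaint_level), label).
TRAINING = [
    (("prepaid", "high"), "churn"),
    (("prepaid", "high"), "churn"),
    (("prepaid", "low"), "stay"),
    (("postpaid", "low"), "stay"),
    (("postpaid", "low"), "stay"),
    (("postpaid", "high"), "churn"),
]


def train(rows):
    """Count labels, per-label feature values, and the distinct values per feature."""
    class_counts = Counter(label for _, label in rows)
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)
    for features, label in rows:
        for i, value in enumerate(features):
            feature_counts[label][i][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def predict(model, features):
    """Return the label with the highest log posterior under the naive Bayes assumption."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(features):
            # Add-one smoothing keeps unseen values from zeroing the product.
            seen = feature_counts[label][i][value]
            score += math.log((seen + 1) / (n + len(values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TRAINING)
    print(predict(model, ("prepaid", "high")))
```

With this toy training set, a prepaid customer with a high complaint level is classified as "churn" and a postpaid customer with a low complaint level as "stay". Tools such as R, RapidMiner or Mahout provide production-grade versions of the same idea.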
Module 10 - Predictive analytics ecosystem - 2: Common predictive analytics problems in Telecom
- Insight analytics
- Visualization analytics
- Structured predictive analytics
- Unstructured predictive analytics
- Customer profiling
- Recommendation Engine
- Pattern detection
- Rule/Scenario discovery – failure, fraud, optimization
- Root cause discovery
- Sentiment analysis
- CRM analytics
- Network analytics
- Text analytics
- Technology assisted review
- Fraud analytics
- Real-time analytics
Module 11 - Network operation analytics: root cause analysis of network failures and service interruptions from metadata, IPDR and
CRM:
- CPU Usage
- Memory Usage
- QoS Queue Usage
- Device Temperature
- Interface Error
- IOS versions
- Routing Events
- Latency variations
- Syslog analytics
- Packet Loss
- Load simulation
- Topology inference
- Performance Threshold
- Device Traps
- IPDR (IP Detail Record) collection and processing
- Use of IPDR data for subscriber bandwidth consumption, network interface utilization, modem status and diagnostics
- HFC information
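The use of IPDR data for subscriber bandwidth consumption can be sketched as a simple aggregation. The record layout below is a heavily simplified assumption (real IPDR exports carry many more fields and a different schema), and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical, simplified usage records; not the real IPDR schema.
RECORDS = [
    {"subscriber": "modem-01", "bytes_down": 4_000_000, "bytes_up": 500_000},
    {"subscriber": "modem-02", "bytes_down": 1_000_000, "bytes_up": 100_000},
    {"subscriber": "modem-01", "bytes_down": 6_000_000, "bytes_up": 1_500_000},
]


def usage_per_subscriber(records):
    """Total bytes (downstream + upstream) consumed by each subscriber."""
    totals = defaultdict(int)
    for record in records:
        totals[record["subscriber"]] += record["bytes_down"] + record["bytes_up"]
    return dict(totals)


def top_consumers(records, n=1):
    """The n subscribers with the highest total consumption, largest first."""
    totals = usage_per_subscriber(records)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    print(usage_per_subscriber(RECORDS))
    print(top_consumers(RECORDS))
```

The same grouping, keyed by interface instead of subscriber, would give a first view of network interface utilization.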
Module 12 - Tools for Network service failure analysis:
- Network Summary Dashboard: monitor overall network deployments and track your organization's key performance indicators
- Peak Period Analysis Dashboard: understand the application and subscriber trends driving peak utilization, with location-specific
granularity
- Routing Efficiency Dashboard: control network costs and build business cases for capital projects with a complete understanding of
interconnect and transit relationships
- Real-Time Entertainment Dashboard: access metrics that matter, including video views, duration, and video quality of experience
(QoE)
- IPv6 Transition Dashboard: investigate the ongoing adoption of IPv6 on your network and gain insight into the applications and
devices driving trends
- Case Study-1: The Alcatel-Lucent Big Network Analytics (BNA) Data Miner
- Multi-dimensional mobile intelligence (m.IQ6)
Module 13 - Big Data BI for Marketing/Sales – Understanding sales/marketing from sales data
- To identify highest velocity clients
- To identify clients for a given product
- To identify the right set of products for a client (Recommendation Engine)
- Market segmentation technique
- Cross-sell and upsell technique
- Client segmentation technique
- Sales revenue forecasting technique
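One simple way to approach the recommendation engine topic in this module is co-occurrence counting: products that are often bought together with what a client already owns are suggested first. The sketch below is an assumption-laden toy (invented purchase history, hypothetical function names), not a description of any specific vendor's engine.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase history: client -> set of products owned.
PURCHASES = {
    "c1": {"voice", "data", "tv"},
    "c2": {"voice", "data"},
    "c3": {"data", "tv"},
    "c4": {"voice"},
}


def co_occurrence(purchases):
    """Count, for every ordered product pair, how many clients own both."""
    pairs = Counter()
    for basket in purchases.values():
        for a, b in combinations(sorted(basket), 2):
            pairs[(a, b)] += 1
            pairs[(b, a)] += 1
    return pairs


def recommend(purchases, client, top_n=1):
    """Rank products the client lacks by co-occurrence with products they own."""
    owned = purchases[client]
    scores = Counter()
    for (a, b), count in co_occurrence(purchases).items():
        if a in owned and b not in owned:
            scores[b] += count
    # Sort by score (descending), then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:top_n]]


if __name__ == "__main__":
    print(recommend(PURCHASES, "c4"))
```

The same counts also hint at cross-sell opportunities: a high co-occurrence between two products suggests offering one to owners of the other.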
Module 14 - BI needed for Telco CFO office
- Overview of business analytics work needed in a CFO office
- Risk analysis on new investment
- Revenue, profit forecasting
- New client acquisition forecasting
- Loss forecasting
- Fraud analytics on finances
Module 15 - Fraud prevention BI from Big Data in Telco – Fraud analytics
- Bandwidth leakage / Bandwidth fraud
- Vendor fraud/overcharging for projects
- Customer refund/claims fraud
- Travel reimbursement fraud
Module 16 - From Churn Prediction to Churn Prevention
- 3 types of churn: Active/Deliberate, Rotational/Incidental, Passive/Involuntary
- 3 classifications of churned customers: Total, Hidden, Partial
- Understanding CRM variables for churn
- Customer behavior data collection
- Customer perception data collection
- Customer demographics data collection
- Cleaning CRM Data
- Unstructured CRM data (customer calls, tickets, emails) and their conversion to structured data for churn analysis
- Social Media CRM – a new way to extract a customer satisfaction index
- Case Study-1: T-Mobile USA: Churn Reduction by 50%
Module 17 - How to use predictive analysis for root cause analysis of customer dissatisfaction
- Case Study-1: Linking dissatisfaction to issues – Accounting, Engineering failures like service interruption, poor bandwidth service
- Case Study-2: Big Data QA dashboard to track customer satisfaction index from various parameters such as call escalations, criticality
of issues, pending service interruption events etc.
Module 18 - Big Data Dashboard for quick accessibility of diverse data and display
- Integration of existing application platform with Big Data Dashboard
- Big Data management
- Case Study of Big Data Dashboard: Tableau and Pentaho
- Use Big Data app to push location based Advertisement
- Tracking system and management
Module 19 - How to justify Big Data BI implementation within an organization:
- Defining ROI for Big Data implementation
- Case studies of saving analyst time in collecting and preparing data – productivity gain
- Case studies of revenue gain from reduced customer churn
- Revenue gain from location-based and other targeted ads
- An integrated spreadsheet approach to calculate approx. expense vs. Revenue gain/savings from Big Data implementation
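The expense versus gain comparison described in this module boils down to a small calculation, sketched here in Python rather than a spreadsheet. All figures and category names are made up for illustration; a real business case would use the organization's own estimates.

```python
# Hypothetical yearly figures, in the same currency unit.
COSTS = {
    "hardware": 200_000,
    "licences_and_support": 100_000,
    "staff": 200_000,
}
GAINS = {
    "analyst_time_saved": 150_000,
    "churn_reduction": 400_000,
    "targeted_ads": 200_000,
}


def net_gain(costs, gains):
    """Total gains minus total costs."""
    return sum(gains.values()) - sum(costs.values())


def roi(costs, gains):
    """Return on investment as a fraction of total cost (0.5 means 50%)."""
    total_cost = sum(costs.values())
    if total_cost == 0:
        raise ValueError("total cost must be non-zero to compute ROI")
    return net_gain(costs, gains) / total_cost


if __name__ == "__main__":
    print(f"Net gain: {net_gain(COSTS, GAINS)}")
    print(f"ROI: {roi(COSTS, GAINS):.0%}")
```

With these example numbers the costs total 500,000 and the gains 750,000, giving a net gain of 250,000 and an ROI of 50%.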
Module 20 - Step-by-step procedure to replace a legacy data system with a Big Data system:
- Understanding a practical Big Data migration roadmap
- What important information is needed before architecting a Big Data implementation
- What are the different ways of calculating volume, velocity, variety and veracity of data
- How to estimate data growth
- Case studies from 2 telcos
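For the data growth topic above, one common first approximation is compound growth: the volume after n years is the current volume multiplied by (1 + annual growth rate) to the power n. The sketch below assumes a constant yearly growth rate, which real systems rarely have; the function names and example figures are hypothetical.

```python
def project_volume(current_tb, annual_growth, years):
    """Projected volume in TB after the given number of years of compound growth."""
    return current_tb * (1 + annual_growth) ** years


def years_until_full(current_tb, annual_growth, capacity_tb):
    """Whole years until the projected volume reaches the given capacity."""
    if current_tb >= capacity_tb:
        return 0
    if annual_growth <= 0:
        raise ValueError("capacity is never reached without positive growth")
    years = 0
    volume = current_tb
    while volume < capacity_tb:
        volume *= 1 + annual_growth
        years += 1
    return years


if __name__ == "__main__":
    # Example: 100 TB today, growing 50% per year.
    print(project_volume(100, 0.5, 2))
    print(years_until_full(100, 0.5, 500))
```

Starting from 100 TB with 50% yearly growth, the projection is 225 TB after two years, and a 500 TB cluster would be full within four years.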