The last decade has seen an explosion in the amount of data available on different systems. This has led to growth in the development and use of techniques for mining this data to extract valuable information. The spectrum of applications includes speech and image processing, biomedical signal processing, bioinformatics, envirometrics, and chemometrics. Principal Components Analysis (PCA) is one of the popular techniques used in many of these applications. PCA has primarily been used to reduce the dimensionality of multivariate data. What is less well known is that it can also be used to identify the linear model (constraints) relating the variables. Under some additional mild conditions, we show how an iterative PCA (IPCA) technique can be used to simultaneously identify the linear model and estimate the variances of the noise that corrupts the measured data. The number of independent variables and an appropriate subset of independent variables can also be precisely identified using IPCA. If the data is from a linear conservative network (such as a flow network), then it is also possible to reconstruct the network purely from data. A flow network is used to illustrate the extraction of such useful information from data. PCA has also been used to obtain calibration models for predicting the concentrations in a mixture given its absorbance spectra. A closely related data analysis technique is non-negative matrix factorization (NMF), which can be used not only to estimate the concentrations but also to extract the pure component spectra from mixture spectra. As with IPCA, we show the use of an iterative NMF technique to extract the noise variances, relative concentrations, and pure component spectra from mixture spectral data alone.
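The idea that PCA can recover linear constraints, and not only reduce dimensionality, can be illustrated with a small sketch. It is not the IPCA method of the talk: it uses ordinary PCA and assumes that all measurements carry the same noise variance and that the number of constraints is known, whereas IPCA is described as estimating unknown noise variances as well. The four-stream flow network, the flow ranges, and all variable names are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical flow network: stream F1 splits into F2 and F3,
# which then merge into F4. Mass balances: A_true @ [F1, F2, F3, F4] = 0.
A_true = np.array([[1.0, -1.0, -1.0, 0.0],
                   [0.0, 1.0, 1.0, -1.0]])

# Simulate noise-free flows from two independent variables (F2, F3).
n_samples = 1000
f2 = rng.uniform(10.0, 20.0, n_samples)
f3 = rng.uniform(5.0, 15.0, n_samples)
X_clean = np.column_stack([f2 + f3, f2, f3, f2 + f3])

# Assumption: identical noise standard deviation on every sensor.
sigma = 0.1
X = X_clean + sigma * rng.standard_normal(X_clean.shape)

# PCA via SVD of the scaled data matrix (rows are samples).
# The data is not mean-centred because the constraints are homogeneous.
_, s, Vt = np.linalg.svd(X / np.sqrt(n_samples), full_matrices=False)
eigvals = s ** 2  # eigenvalues of X.T @ X / n_samples, in descending order

# Assumption: the number of constraints is known in advance.
n_constraints = 2

# The right singular vectors with the smallest singular values span
# (approximately) the row space of the constraint matrix.
A_hat = Vt[-n_constraints:]

# With equal noise variances, the small eigenvalues estimate sigma**2.
noise_var_hat = eigvals[-n_constraints:].mean()

# A_hat matches A_true only up to an invertible mixing of its rows,
# so compare subspaces: project A_true onto the estimated row space.
P = A_hat.T @ A_hat
subspace_error = np.linalg.norm(A_true - A_true @ P)

print("eigenvalues:", eigvals)
print("estimated noise variance:", noise_var_hat)
print("subspace error:", subspace_error)
```

The estimated constraint matrix is determined only up to a rotation of its rows, which is why the comparison above is made between subspaces and not element by element.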