

research [2014/03/25 13:43] Nicolas Pasquier
research [2015/07/06 17:27] (current) Frederic Precioso [Boosting]


====== Research Topics ======

The MinD research group aims at developing algorithms for data mining and machine learning, with a focus on large-scale data. In particular, MinD has expertise in [[research#Concept Lattices|Concept Lattices]], [[research#Evolutionary Computation|Evolutionary Computation]], [[research#Multi-Agent Systems|Multi-Agent Systems]], [[research#Naïve Bayes|Naïve Bayes]], [[research#Random Forests|Random Forests]], [[research#Support Vector Machines|Support Vector Machines]], [[research#Boosting|Boosting]], [[research#Deep Learning|Deep Learning]], ...

These methods are used to extract knowledge from [[wp>Big Data]] for:

* Association rule learning


----

===== Concept Lattices =====

{{ :concept_lattices.png?500|Example dataset and corresponding concept lattice}}

Concept lattices are theoretical structures defined according to the [[http://en.wikipedia.org/wiki/Galois_connection|Galois connection]] of a finite binary relation.

Given a set of instances (objects) described by a list of properties (variable values), the concept lattice is a hierarchy of concepts in which each concept associates a set of instances (extent) sharing the same values for a certain set of properties (intent).

Concepts are partially ordered in the lattice according to the inclusion relation: each sub-concept in the lattice contains a subset of the instances and a superset of the properties of the concepts above it.

In data mining, concept lattices serve as a theoretical framework for the efficient extraction of lossless condensed representations of [[http://en.wikipedia.org/wiki/Association_rule_learning|association rules]], the generation of [[http://en.wikipedia.org/wiki/Classification_rule|classification rules]], and for hierarchical [[http://en.wikipedia.org/wiki/Biclustering|biclustering]].
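As a small illustration of the Galois connection described above, the following Python sketch enumerates all formal concepts of a toy binary relation by closing attribute sets (the relation, its objects, and its attributes are invented for the example):

```python
from itertools import combinations

# Toy formal context: each object mapped to the attributes it has.
relation = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
}
attributes = {"a", "b", "c"}

def extent(attr_set):
    """Objects possessing every attribute in attr_set."""
    return {o for o, attrs in relation.items() if attr_set <= attrs}

def intent(obj_set):
    """Attributes shared by every object in obj_set."""
    if not obj_set:
        return set(attributes)
    return set.intersection(*(relation[o] for o in obj_set))

# A formal concept is a pair (extent, intent) that is closed under the
# Galois connection between object sets and attribute sets.
concepts = set()
for r in range(len(attributes) + 1):
    for combo in combinations(sorted(attributes), r):
        e = extent(set(combo))
        concepts.add((frozenset(e), frozenset(intent(e))))

# Print the concepts from the top of the lattice (largest extent) down.
for e, b in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(e), "<->", sorted(b))
```

This brute-force closure is exponential in the number of attributes; dedicated algorithms exploit the lattice structure to enumerate concepts far more efficiently.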

----

===== Evolutionary Computation =====

{{ea.png?450 }}

[[wp>Evolutionary_Algorithm|Evolutionary Algorithms (EA)]] are nature-inspired, stochastic algorithms that mimic Darwinian evolution for optimization problems.

The particularity of EAs is their capacity to deal with multiple objectives (e.g. maximizing profit while minimizing cost) and multi-modality (several optimal solutions), since the algorithm maintains a population of solutions, as well as with discrete or continuous optimization, dynamic optimization, and many other fundamental problem settings.
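The population-based search described above can be sketched as a minimal (mu + lambda) evolutionary algorithm minimizing the sphere function; the population size, mutation strength, and test function below are arbitrary choices for the example, not the group's actual setup:

```python
import random

random.seed(0)

def fitness(x):
    """Sphere function: global minimum 0 at the origin."""
    return sum(v * v for v in x)

def evolve(dim=3, pop_size=20, generations=200, sigma=0.3):
    # Initialise a random population of candidate solutions.
    pop = [[random.uniform(-5, 5) for _ in range(dim)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Mutation: each parent produces one Gaussian-perturbed child.
        children = [[v + random.gauss(0, sigma) for v in p] for p in pop]
        # (mu + lambda) selection: keep the best individuals among
        # parents and children, so the best solution is never lost.
        pop = sorted(pop + children, key=fitness)[:pop_size]
    return pop[0]

best = evolve()
print("best fitness:", fitness(best))
```

Because parents compete with their children for survival, fitness of the best individual can only improve over the generations.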


----

===== Random Forests =====


We use [[wp>Random Forests (RF)]] for classification and prediction in many different fields. They are quite efficient on high-dimensional data, and the results obtained are often better than those of other classical methods. RF is a supervised approach: the samples used to build the trees are labeled and split into two subsets, a training set and a test set. The training set is used to construct the trees of the forest, and the test set is used to validate the resulting forest.

Two specific techniques are used when constructing the trees: Bootstrap Aggregating (bagging), which selects a subset of the training set for each tree, and Random Feature Selection, which selects a subset of the features characterizing each sample. The best feature to split on at each node of a tree is chosen from this subset.

We use RF for short-text classification, body-action recognition with one or several Kinects, classification and prediction of coastal currents, classification and prediction for air pollution, and prediction for the auto-adaptation of sensors (ubiquitous computing).
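The bagging and random-feature-selection steps described above can be sketched with decision stumps standing in for full trees; the toy two-feature dataset, the fixed candidate thresholds, and the forest size below are all invented for the example:

```python
import random

random.seed(42)

def make_point():
    """Feature 0 determines the label; feature 1 is a noisy copy of it."""
    x0 = random.random()
    x1 = 0.8 * x0 + 0.2 * random.random()
    return [x0, x1], int(x0 > 0.5)

data = [make_point() for _ in range(100)]
train, test = data[:70], data[70:]

def train_stump(sample, feats):
    """Best (feature, threshold, polarity) split on a bootstrap sample,
    restricted to a random subset of the features."""
    best, best_acc = None, -1.0
    for f in feats:
        for thr in (0.25, 0.5, 0.75):
            for pol in (0, 1):  # which side of the split predicts class 1
                acc = sum((pol if x[f] > thr else 1 - pol) == y
                          for x, y in sample) / len(sample)
                if acc > best_acc:
                    best, best_acc = (f, thr, pol), acc
    f, thr, pol = best
    return lambda x: pol if x[f] > thr else 1 - pol

def train_forest(train, n_trees=25):
    forest = []
    for _ in range(n_trees):
        # Bagging: one bootstrap sample (drawn with replacement) per tree.
        boot = [random.choice(train) for _ in train]
        # Random feature selection: each tree sees one random feature.
        feats = random.sample(range(2), 1)
        forest.append(train_stump(boot, feats))
    return forest

def predict(forest, x):
    """Majority vote over the trees of the forest."""
    return int(sum(t(x) for t in forest) * 2 >= len(forest))

forest = train_forest(train)
accuracy = sum(predict(forest, x) == y for x, y in test) / len(test)
print("test accuracy:", accuracy)
```

Real random forests grow full decision trees and draw a fresh feature subset at every node rather than once per tree; the sketch only isolates the two randomization mechanisms.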


----

===== Support Vector Machines =====

In machine learning, [[wp>support vector machines]] (SVMs, also called support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.

In addition to performing linear classification, SVMs can efficiently perform non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces.
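The kernel trick can be illustrated with a degree-2 polynomial kernel: evaluating (x·z)^2 in the two-dimensional input space yields exactly the dot product under the explicit feature map phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2), without ever constructing that map (the sample vectors below are arbitrary):

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed directly in input space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def phi(x):
    """Explicit feature map into the 3-D space the kernel works in
    implicitly: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, z = (1.0, 2.0), (3.0, 0.5)
k_input = poly_kernel(x, z)      # kernel evaluated in input space
k_feature = dot(phi(x), phi(z))  # same value via the explicit map
print(k_input, k_feature)        # both equal 16.0 (up to rounding)
```

Kernels such as the RBF kernel correspond to infinite-dimensional feature maps, which is precisely why the implicit computation matters: the explicit map could never be materialized.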

----

===== Boosting =====

[[https://en.wikipedia.org/wiki/Boosting_(machine_learning)|Boosting]] is a machine learning ensemble meta-algorithm primarily for reducing bias, and also variance, in supervised learning, and a family of machine learning algorithms which convert weak learners into strong ones. Boosting is based on the question posed by Kearns and Valiant (1988, 1989): can a set of weak learners create a single strong learner? A weak learner is defined as a classifier that is only slightly correlated with the true classification (it can label examples better than random guessing). In contrast, a strong learner is a classifier that is arbitrarily well correlated with the true classification.
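This weak-to-strong conversion can be illustrated with a minimal AdaBoost over threshold stumps on a toy 1-D interval problem: no single stump exceeds 70% accuracy on this data, yet a weighted vote of three boosted stumps classifies it perfectly (the dataset and round count are chosen for the example):

```python
import math

# Toy 1-D dataset: label +1 inside the interval [3, 6], -1 outside.
# No single threshold stump can separate an interval perfectly.
X = list(range(10))
Y = [1 if 3 <= x <= 6 else -1 for x in X]

def stumps():
    """All threshold stumps h(x) = s if x > thr else -s."""
    for thr in (i - 0.5 for i in range(11)):
        for s in (1, -1):
            yield thr, s

def stump_predict(thr, s, x):
    return s if x > thr else -s

def adaboost(X, Y, rounds=3):
    n = len(X)
    w = [1.0 / n] * n            # start from uniform example weights
    ensemble = []                # list of (alpha, thr, s)
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        err, thr, s = min(
            (sum(wi for wi, x, y in zip(w, X, Y)
                 if stump_predict(t, sg, x) != y), t, sg)
            for t, sg in stumps()
        )
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-12))
        ensemble.append((alpha, thr, s))
        # Re-weight: misclassified points gain weight, the rest lose it,
        # so the next round focuses on the hard examples.
        w = [wi * math.exp(-alpha * y * stump_predict(thr, s, x))
             for wi, x, y in zip(w, X, Y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(thr, s, x) for a, thr, s in ensemble)
    return 1 if score >= 0 else -1

ensemble = adaboost(X, Y)
accuracy = sum(predict(ensemble, x) == y for x, y in zip(X, Y)) / len(X)
print("training accuracy:", accuracy)
```

Each round the chosen stump is barely better than chance on the re-weighted data, yet the exponentially re-weighted vote combines them into a strong classifier, which is the essence of the Kearns-Valiant question answered in the affirmative.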

----

===== Deep Learning =====

----