{"id":590,"date":"2015-01-31T13:31:21","date_gmt":"2015-01-31T18:31:21","guid":{"rendered":"http:\/\/josephpcohen.com\/w\/?p=590"},"modified":"2017-03-28T00:39:54","modified_gmt":"2017-03-28T04:39:54","slug":"wrappersearchspace","status":"publish","type":"post","link":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/","title":{"rendered":"Visualization of Attribute Subset Wrapper Search Space"},"content":{"rendered":"<p>Many researchers working on attribute subset selection focus on finding a suboptimal approximation of the most discriminating subset using methods such as greedy search, best-first search, genetic algorithms, feature filtering, feature clustering, boosting, Markov blankets, mutual information, entropy, and many more. One difficulty in judging these algorithms is knowing whether they are appropriate for the problem space. Here I attempt to visualize the 2^n space in order to gain a deeper understanding of which methods will work.<\/p>\n<p><!--more--><\/p>\n<p>Below is a Kohavi diagram representing the wrapper subset search space for 4 features. 
1111 contains all features, 0000 contains no features, and each line represents the addition or removal of a feature.<\/p>\n<p><a href=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/Screen-Shot-2015-01-31-at-12.36.50-PM.png\"><img decoding=\"async\" class=\"aligncenter size-large wp-image-593\" src=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/Screen-Shot-2015-01-31-at-12.36.50-PM-1024x482.png\" alt=\"Screen Shot 2015-01-31 at 12.36.50 PM\" width=\"70%\" srcset=\"https:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/Screen-Shot-2015-01-31-at-12.36.50-PM-1024x482.png 1024w, https:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/Screen-Shot-2015-01-31-at-12.36.50-PM-300x141.png 300w, https:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/Screen-Shot-2015-01-31-at-12.36.50-PM-638x300.png 638w, https:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/Screen-Shot-2015-01-31-at-12.36.50-PM.png 1480w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n<p>\nThe diagrams below show the complete wrapper search spaces for various datasets from UCI. Each point represents one subset, scored by the F1-Score of 5-fold cross-validation. Per Kohavi, the cross-validation is repeated until the standard deviation is below 1% or a maximum of 5 times, and the mean is what is shown. The classifier used is Naive Bayes.<\/p>\n<p>The colors represent the F1-Score and are scaled to show more contrast. Blue covers the range from the minimum value to the mean and red covers the range from the mean to the maximum value. 
The scaling code is:<\/p>\n<pre>\r\n\/\/ Normalize the score to the range [0, 1].\r\noval = rows[x].get(y);\r\nsoval = (oval-min)\/(max-min);\r\n\/\/ Shift to [-255, 255]; the sign of val selects the color channel.\r\nval = max(0,(int) (soval*255.0*2))-255;\r\n\/\/ Positive val drives red, negative val drives blue, green is unused.\r\nr = max(min(255,val),0);\r\ng = 0;\r\nb = max(min(255,-1*val),0);\r\n<\/pre>\n<p>File: <a href=\"http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/iris.arff\">http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/iris.arff<\/a><br \/>\nFeatures: 4<br \/>\n<a href=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-iris-NaiveBayes-f1.png\"><img decoding=\"async\" src=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-iris-NaiveBayes-f1.png\" alt=\"viz-iris\" class=\"aligncenter size-full wp-image-597\" height=30% \/><\/a><\/p>\n<p>File: <a href=\"http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/tae.arff\">http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/tae.arff<\/a><br \/>\nFeatures: 5<br \/>\n<a href=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-tae-NaiveBayes-f1.png\"><img decoding=\"async\" src=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-tae-NaiveBayes-f1.png\" alt=\"viz-tae\" class=\"aligncenter size-full wp-image-597\" width=50% \/><\/a><\/p>\n<p>File: <a href=\"http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/car.arff\">http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/car.arff<\/a><br \/>\nFeatures: 6<\/p>\n<p><a href=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-car-NaiveBayes-f1.png\"><img decoding=\"async\" src=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-car-NaiveBayes-f1.png\" alt=\"viz-car\" class=\"aligncenter size-full wp-image-597\" width=100% \/><\/a><\/p>\n<p>File: <a href=\"http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/ecoli.arff\">http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/ecoli.arff<\/a><br \/>\nFeatures: 7<\/p>\n<p>F1-Score<br \/>\n<a href=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-ecoli-NaiveBayes-f1.png\"><img decoding=\"async\" 
src=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-ecoli-NaiveBayes-f1.png\" alt=\"viz-ecoli\" class=\"aligncenter size-full wp-image-597\" width=100% \/><\/a><\/p>\n<p>File: <a href=\"http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/diabetes.arff\">http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/diabetes.arff<\/a><br \/>\nFeatures: 8<\/p>\n<p>F1-Score<br \/>\n<a href=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-pima_diabetes-NaiveBayes-f1.png\"><img decoding=\"async\" src=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-pima_diabetes-NaiveBayes-f1.png\" alt=\"viz-pima_diabetes\" class=\"aligncenter size-full wp-image-597\" width=100% \/><\/a><\/p>\n<p>File: <a href=\"http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/bridges_version1.arff\">http:\/\/repository.seasr.org\/Datasets\/UCI\/arff\/bridges_version1.arff<\/a><br \/>\nFeatures: 10<br \/>\nF1-Score<br \/>\n<a href=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-bridges-version1-NaiveBayes-f1.png\"><img decoding=\"async\" src=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-bridges-version1-NaiveBayes-f1.png\" alt=\"viz-bridges-version1\" class=\"aligncenter size-full wp-image-597\" width=100% \/><\/a><\/p>\n<p>The most interesting dataset was monks-2.train, because the two classifiers don&#8217;t appear to share many of the same optimal subsets. It also doesn&#8217;t appear to have paths of high-scoring subsets leading from 11.1 to a better subset. For the NaiveBayes classifier the scores were min\/max=0.459\/0.558. 
For the J48 classifier the scores were min\/max=0.476\/0.574.<\/p>\n<p>File: <a href=\"https:\/\/ml-dolev-amit.googlecode.com\/svn\/trunk\/weka\/optional_datasets\/monks-2.train.arff\">https:\/\/ml-dolev-amit.googlecode.com\/svn\/trunk\/weka\/optional_datasets\/monks-2.train.arff<\/a><br \/>\nFeatures: 10<br \/>\nF1-Score<\/p>\n<p><center><\/p>\n<table>\n<tr>\n<td>\n<a href=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-monks-2.train-NaiveBayes-f1.png\"><img decoding=\"async\" src=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-monks-2.train-NaiveBayes-f1.png\" alt=\"viz-monks-2.train-NaiveBayes\" class=\"aligncenter size-full wp-image-597\" width=100% \/><\/a>\n<\/td>\n<td>\n<a href=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-monks-2.train-J48f1.png\"><img decoding=\"async\" src=\"http:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/viz-monks-2.train-J48-f1.png\" alt=\"viz-monks-2.train-J48\" class=\"aligncenter size-full wp-image-597\" width=100% \/><\/a>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<center>NaiveBayes<\/center>\n<\/td>\n<td>\n<center>J48<\/center>\n<\/td>\n<\/tr>\n<\/table>\n<p><\/center><\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"mh-excerpt\"><p>Many researchers working on attribute subset selection focus on finding the most discriminating subset suboptimal solution using various methods such as greedy search, best first, <a class=\"mh-excerpt-more\" href=\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/\" title=\"Visualization of Attribute Subset Wrapper Search Space\">[&#8230;]<\/a><\/p>\n<\/div>","protected":false},"author":1,"featured_media":593,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[13,1],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.1 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Visualization of Attribute Subset Wrapper Search Space - Joseph Paul Cohen PhD<\/title>\n<meta 
name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Visualization of Attribute Subset Wrapper Search Space - Joseph Paul Cohen PhD\" \/>\n<meta property=\"og:description\" content=\"Many researchers working on attribute subset selection focus on finding the most discriminating subset suboptimal solution using various methods such as greedy search, best first, [...]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/\" \/>\n<meta property=\"og:site_name\" content=\"Joseph Paul Cohen PhD\" \/>\n<meta property=\"article:published_time\" content=\"2015-01-31T18:31:21+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2017-03-28T04:39:54+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/Screen-Shot-2015-01-31-at-12.36.50-PM.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1480\" \/>\n\t<meta property=\"og:image:height\" content=\"696\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Joseph Paul Cohen\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Joseph Paul Cohen\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/\"},\"author\":{\"name\":\"Joseph Paul Cohen\",\"@id\":\"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/e25d0d5746952220f35d182ca7aa8684\"},\"headline\":\"Visualization of Attribute Subset Wrapper Search Space\",\"datePublished\":\"2015-01-31T18:31:21+00:00\",\"dateModified\":\"2017-03-28T04:39:54+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/\"},\"wordCount\":358,\"publisher\":{\"@id\":\"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/e25d0d5746952220f35d182ca7aa8684\"},\"articleSection\":[\"References\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/\",\"url\":\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/\",\"name\":\"Visualization of Attribute Subset Wrapper Search Space - Joseph Paul Cohen PhD\",\"isPartOf\":{\"@id\":\"https:\/\/josephpcohen.com\/w\/#website\"},\"datePublished\":\"2015-01-31T18:31:21+00:00\",\"dateModified\":\"2017-03-28T04:39:54+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/josephpcohen.com\/w\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Visualization of Attribute Subset Wrapper Search 
Space\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/josephpcohen.com\/w\/#website\",\"url\":\"https:\/\/josephpcohen.com\/w\/\",\"name\":\"Joseph Paul Cohen PhD\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/e25d0d5746952220f35d182ca7aa8684\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/josephpcohen.com\/w\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/e25d0d5746952220f35d182ca7aa8684\",\"name\":\"Joseph Paul Cohen\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/a810b57939e75247f570c9094e7bd16e?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/a810b57939e75247f570c9094e7bd16e?s=96&d=mm&r=g\",\"caption\":\"Joseph Paul Cohen\"},\"logo\":{\"@id\":\"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Visualization of Attribute Subset Wrapper Search Space - Joseph Paul Cohen PhD","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/","og_locale":"en_US","og_type":"article","og_title":"Visualization of Attribute Subset Wrapper Search Space - Joseph Paul Cohen PhD","og_description":"Many researchers working on attribute subset selection focus on finding the most discriminating subset suboptimal solution using various methods such as greedy search, best first, [...]","og_url":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/","og_site_name":"Joseph Paul Cohen PhD","article_published_time":"2015-01-31T18:31:21+00:00","article_modified_time":"2017-03-28T04:39:54+00:00","og_image":[{"width":1480,"height":696,"url":"https:\/\/josephpcohen.com\/w\/wp-content\/uploads\/2015\/01\/Screen-Shot-2015-01-31-at-12.36.50-PM.png","type":"image\/png"}],"author":"Joseph Paul Cohen","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Joseph Paul Cohen","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/#article","isPartOf":{"@id":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/"},"author":{"name":"Joseph Paul Cohen","@id":"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/e25d0d5746952220f35d182ca7aa8684"},"headline":"Visualization of Attribute Subset Wrapper Search Space","datePublished":"2015-01-31T18:31:21+00:00","dateModified":"2017-03-28T04:39:54+00:00","mainEntityOfPage":{"@id":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/"},"wordCount":358,"publisher":{"@id":"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/e25d0d5746952220f35d182ca7aa8684"},"articleSection":["References"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/","url":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/","name":"Visualization of Attribute Subset Wrapper Search Space - Joseph Paul Cohen PhD","isPartOf":{"@id":"https:\/\/josephpcohen.com\/w\/#website"},"datePublished":"2015-01-31T18:31:21+00:00","dateModified":"2017-03-28T04:39:54+00:00","breadcrumb":{"@id":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/josephpcohen.com\/w\/wrappersearchspace\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/josephpcohen.com\/w\/wrappersearchspace\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/josephpcohen.com\/w\/"},{"@type":"ListItem","position":2,"name":"Visualization of Attribute Subset Wrapper Search Space"}]},{"@type":"WebSite","@id":"https:\/\/josephpcohen.com\/w\/#website","url":"https:\/\/josephpcohen.com\/w\/","name":"Joseph Paul Cohen 
PhD","description":"","publisher":{"@id":"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/e25d0d5746952220f35d182ca7aa8684"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/josephpcohen.com\/w\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/e25d0d5746952220f35d182ca7aa8684","name":"Joseph Paul Cohen","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/a810b57939e75247f570c9094e7bd16e?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a810b57939e75247f570c9094e7bd16e?s=96&d=mm&r=g","caption":"Joseph Paul Cohen"},"logo":{"@id":"https:\/\/josephpcohen.com\/w\/#\/schema\/person\/image\/"}}]}},"_links":{"self":[{"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/posts\/590"}],"collection":[{"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/comments?post=590"}],"version-history":[{"count":53,"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/posts\/590\/revisions"}],"predecessor-version":[{"id":950,"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/posts\/590\/revisions\/950"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/media\/593"}],"wp:attachment":[{"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/media?parent=590"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/josephpcohen.com\/w\/wp-json\/wp\/v2\/categories?post=590"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/j
osephpcohen.com\/w\/wp-json\/wp\/v2\/tags?post=590"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}