{"id":3364,"date":"2023-12-17T20:37:54","date_gmt":"2023-12-17T11:37:54","guid":{"rendered":"https:\/\/saraheee.com\/?p=3364"},"modified":"2023-12-17T20:37:56","modified_gmt":"2023-12-17T11:37:56","slug":"dp3-database-anonymization-randomized-response","status":"publish","type":"post","link":"https:\/\/saraheee.com\/ko\/2023\/12\/dp3-database-anonymization-randomized-response\/","title":{"rendered":"[DP#3] Database anonymization \u2013\u00a0randomized response"},"content":{"rendered":"<h2 class=\"wp-block-heading\">Randomized Response Explained<\/h2>\n\n\n\n<p>Have you ever cheated on your spouse?<br>Are you using any illegal drugs?<br>Have you committed any crimes in your past?<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Randomized Response<\/h4>\n\n\n\n<p>S.L. Warner, 1965<br>&#8211; I never said that!<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">How it works<\/h4>\n\n\n\n<p><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-book-reviews-quotation-color\">Scenario: respondents are asked a sensitive yes-or-no (binary) question.<br>&#8211; Have you ever used illegal drugs?<\/mark><\/p>\n\n\n\n<p><mark style=\"background-color:rgba(0, 0, 0, 0)\" class=\"has-inline-color has-book-reviews-quotation-color\">Instead of answering directly, respondents are asked to flip a coin in private.<br>If the coin comes up heads, they answer truthfully.<br>If it comes up tails, they flip the coin again and answer yes if it lands heads and no if it lands tails.<br>In practice, you would flip the coin twice in both scenarios.<\/mark><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Data Collection<\/h4>\n\n\n\n<p>Real answer = &#8220;Yes&#8221;<br>Probability of reporting &#8220;Yes&#8221;: 1\/2 (first flip heads, truthful answer) + 1\/2 &times; 1\/2 (first flip tails, second flip heads) = 3\/4 = 75%<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Advantages<\/h4>\n\n\n\n<p>Increased Truthfulness: encourages respondents to give more truthful answers to sensitive questions (because the actual response is 
masked by randomness)<br>Privacy Protection: protects respondents' privacy even if someone gains access to the survey data<br>Statistical Analysis: statistical analysis can estimate the prevalence of sensitive behaviors or traits within a population<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Challenges<\/h4>\n\n\n\n<p>Randomization Device Choice<br>Sample Size<br>Assumption of Randomness<br>Ethical Concerns<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Applications<\/h4>\n\n\n\n<p>Substance abuse<br>Criminal behavior<br>Sexual preferences<br>Local Differential Privacy<br>RAPPOR<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Conclusion<\/h4>\n\n\n\n<p>Collecting sensitive information<br>Preserving privacy<br>Enhancing the quality<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">\\(k^m\\)-anonymity<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>Name<\/td><td>Age<\/td><td>Sex<\/td><td>ZIP<\/td><td>Disease<\/td><\/tr><tr><td>Alice<\/td><td>28<\/td><td>F<\/td><td>23467<\/td><td>Cancer<\/td><\/tr><tr><td>Bob<\/td><td>17<\/td><td>M<\/td><td>12345<\/td><td>Heart disease<\/td><\/tr><tr><td>Charly<\/td><td>34<\/td><td>M<\/td><td>65490<\/td><td>Flu<\/td><\/tr><tr><td>Dave<\/td><td>41<\/td><td>M<\/td><td>84933<\/td><td>Bronchitis<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\">Identifiers: Name &#8211; removed<br>Quasi-identifiers: Age, Sex, ZIP<br>Sensitive attribute: Disease<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>Purchases<\/td><td>Purchases(anonymity)<\/td><\/tr><tr><td>Cherries, Apples, Grapes, Tea, Coffee, Diapers<\/td><td>Cherries, Apples, Grapes, Tea, Coffee, Diapers<\/td><\/tr><tr><td>Flour, Coffee, Bread, Butter<\/td><td>Cherries, Apples, Grapes, Bread, Butter<\/td><\/tr><tr><td>Milk, 
Yoghurt, Brie, Stilton<\/td><td>Milk, Yoghurt, Brie, Stilton<\/td><\/tr><tr><td>Jam, Gouda, Ham, Bacon, Pepperoni<\/td><td>Cherries, Apples, Grapes, Gouda, Ham, Bacon, Pepperoni<\/td><\/tr><tr><td>Tomatoes, Broccoli, Potatoes, Pasta, Rice<\/td><td>Cherries, Apples, Grapes, Broccoli, Potatoes, Pasta, Rice<\/td><\/tr><tr><td>Soft Drink, Beer, Fish, Meat, Eggs<\/td><td>Soft Drink, Beer, Fish, Meat, Eggs<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\">\\(k^3\\)-anonymity: k records share the same m items (m=3: Cherries, Apples, Grapes)<br>Here: \\(4^3\\)-anonymity (four records contain these three items)<\/figcaption><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\">Generalization<\/h4>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/saraheee.com\/wp-content\/uploads\/2023\/12\/image-12-1024x699.png\" alt=\"\" class=\"wp-image-3381\" width=\"768\" height=\"524\" srcset=\"https:\/\/saraheee.com\/wp-content\/uploads\/2023\/12\/image-12-1024x699.png 1024w, https:\/\/saraheee.com\/wp-content\/uploads\/2023\/12\/image-12-300x205.png 300w, https:\/\/saraheee.com\/wp-content\/uploads\/2023\/12\/image-12-768x524.png 768w, https:\/\/saraheee.com\/wp-content\/uploads\/2023\/12\/image-12-1536x1048.png 1536w, https:\/\/saraheee.com\/wp-content\/uploads\/2023\/12\/image-12-2048x1398.png 2048w\" sizes=\"(max-width: 768px) 100vw, 768px\" \/><\/figure><\/div>\n\n\n<p>Cherries, apples, and grapes can all be generalized to Fruit.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>Purchases(anonymity)<\/td><td>Purchases(generalization)<\/td><\/tr><tr><td>Cherries, Apples, Grapes, Tea, Coffee, Diapers<\/td><td>Fruits, Beverages, Household Items<\/td><\/tr><tr><td>Cherries, Apples, Grapes, Bread, Butter<\/td><td>Fruits, Bread, Dairy<\/td><\/tr><tr><td>Milk, Yoghurt, Brie, Stilton<\/td><td>Dairy<\/td><\/tr><tr><td>Cherries, Apples, Grapes, Gouda, Ham, Bacon, Pepperoni<\/td><td>Fruits, Dairy, 
Sausages<\/td><\/tr><tr><td>Cherries, Apples, Grapes, Broccoli, Potatoes, Pasta, Rice<\/td><td>Fruits, Vegetables, Pasta<\/td><\/tr><tr><td>Soft Drink, Beer, Fish, Meat, Eggs<\/td><td>Beverages, Meats<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\">Trade-off between accuracy and privacy<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Step-By-Step Guide on Implementing Multi-Dimensional Mondrian to Achieve k-anonymity<\/h2>\n\n\n\n<p>python mondrian_youtube.py<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\n\ndata = pd.read_csv('data.csv')\nk = 3  # minimum number of records per partition\n\n# quasi-identifier columns; the names must match the header of data.csv\nqis = &#091;'Age', 'ZIP', 'Gender']\n\ndef summarized(partition, dim):\n    # replace each quasi-identifier that varies within the partition by its &#091;min - max] range\n    for qi in qis:\n        partition = partition.sort_values(by=qi)\n\n        if partition&#091;qi].iloc&#091;0] != partition&#091;qi].iloc&#091;-1]:\n            s = f\"&#091;{partition&#091;qi].iloc&#091;0]} - {partition&#091;qi].iloc&#091;-1]}]\"\n            partition&#091;qi] = &#091;s]*partition&#091;qi].size\n    return partition\n\ndef anonymize(partition, ranks):\n    # split on the quasi-identifier with the most distinct values\n    dim = ranks&#091;0]&#091;0]\n\n    partition = partition.sort_values(by=dim)\n    si = partition&#091;dim].count()\n    mid = si\/\/2\n\n    lhs = partition&#091;:mid]\n    rhs = partition&#091;mid:]\n\n    # keep splitting only while both halves hold at least k records\n    if len(lhs) >= k and len(rhs) >= k:\n        return pd.concat(&#091;anonymize(lhs, ranks), anonymize(rhs, ranks)])\n    return summarized(partition, dim)\n\ndef mondrian(partition):\n    ranks = {}\n\n    # count the distinct values of each quasi-identifier\n    for qi in qis:\n        ranks&#091;qi] = len(set(partition&#091;qi]))\n\n    # sort quasi-identifiers by distinct-value count, highest first\n    ranks = sorted(ranks.items(), key=lambda t: t&#091;1], reverse=True)\n    print(ranks)\n\n    return anonymize(partition, ranks)\n\nresult = mondrian(data)\n\nprint(result)\nresult.to_csv('anon_youtube.csv', index=False)<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">References<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>list: <a 
href=\"https:\/\/www.youtube.com\/playlist?list=PLZeK3TZueogEhGK0kTztL5ALQ_MkxgFCv\" rel=\"noopener\">https:\/\/www.youtube.com\/playlist?list=PLZeK3TZueogEhGK0kTztL5ALQ_MkxgFCv<\/a><\/li>\n\n\n\n<li>Security and Privacy Academy, (9\/11) Randomized Response Explained, Oct 25, 2023, <a href=\"https:\/\/youtu.be\/SD7EzSkBXug?si=cFuxx8FFWUM0lmeX\" rel=\"noopener\">https:\/\/youtu.be\/SD7EzSkBXug?si=cFuxx8FFWUM0lmeX<\/a><\/li>\n\n\n\n<li>Security and Privacy Academy, (10\/11) km-anonymity, Dec 14, 2023, <a href=\"https:\/\/youtu.be\/v7lruU9gIOg?si=95nbGq8BY2ApzOkr\" rel=\"noopener\">https:\/\/youtu.be\/v7lruU9gIOg?si=95nbGq8BY2ApzOkr<\/a><\/li>\n\n\n\n<li>Security and Privacy Academy, (11\/11) Step-By-Step Guide on Implementing Multi-Dimensional Mondrian to Achieve k-anonymity, <a href=\"https:\/\/youtu.be\/aky-3MfeJ4E?si=OIBplNgeVrmArujH\" rel=\"noopener\">https:\/\/youtu.be\/aky-3MfeJ4E?si=OIBplNgeVrmArujH<\/a><\/li>\n<\/ul>\n\n\n\n<p><\/p>","protected":false},"excerpt":{"rendered":"<p>Randomized response techniques have been developed and applied, allowing researchers to gather sensitive information without compromising respondent privacy or 
truthfulness.<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[154],"tags":[160,164,155,165],"class_list":["post-3364","post","type-post","status-publish","format-standard","hentry","category-dp","tag-database-anonymization","tag-dec-17-2023","tag-differential-privacy","tag-randomized-response"],"_links":{"self":[{"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/posts\/3364"}],"collection":[{"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/comments?post=3364"}],"version-history":[{"count":22,"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/posts\/3364\/revisions"}],"predecessor-version":[{"id":3390,"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/posts\/3364\/revisions\/3390"}],"wp:attachment":[{"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/media?parent=3364"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/categories?post=3364"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/saraheee.com\/ko\/wp-json\/wp\/v2\/tags?post=3364"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}