{"id":438,"date":"2026-02-22T21:58:53","date_gmt":"2026-02-22T21:58:53","guid":{"rendered":"https:\/\/ffritze.de\/?p=438"},"modified":"2026-04-21T20:48:08","modified_gmt":"2026-04-21T20:48:08","slug":"q-learning-unplugged","status":"publish","type":"post","link":"https:\/\/ffritze.de\/en\/q-learning-unplugged\/","title":{"rendered":"Q-Learning \u2013 unplugged"},"content":{"rendered":"<p class=\"wp-block-paragraph\">The QField lets you recreate the process of reinforcement learning with a piece of paper and a pen. The goal is to learn a path across the field using an unplugged version of the algorithm. Unlike other algorithms that aim for optimal solutions, this approach focuses on learning through trial and feedback.<\/p>\n\n\n\n<!--more-->\n\n\n\n<h3 class=\"wp-block-heading\">Preparation<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Print out the picture or draw it yourself (instead of Scratch you can write 'Start' in the first field, and 'Goal' instead of the pie). You also need a game piece (like the ones in the board game Mensch \u00e4rgere Dich nicht).<\/p>\n\n\n\n<div class=\"wp-block-group is-content-justification-center is-nowrap is-layout-flex wp-container-core-group-is-layout-82936891 wp-block-group-is-layout-flex\">\n<figure class=\"wp-block-image size-large is-resized\"><a href=\"https:\/\/ffritze.de\/wp-content\/uploads\/2026\/02\/QFeld.png\" target=\"_blank\" rel=\" noreferrer noopener\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"827\" src=\"https:\/\/ffritze.de\/wp-content\/uploads\/2026\/02\/QFeld-1024x827.png\" alt=\"\" class=\"wp-image-479\" style=\"box-shadow:var(--wp--preset--shadow--natural);aspect-ratio:1.2382127869661812;width:448px;height:auto\" srcset=\"https:\/\/ffritze.de\/wp-content\/uploads\/2026\/02\/QFeld-1024x827.png 1024w, https:\/\/ffritze.de\/wp-content\/uploads\/2026\/02\/QFeld-300x242.png 300w, https:\/\/ffritze.de\/wp-content\/uploads\/2026\/02\/QFeld-768x620.png 768w, https:\/\/ffritze.de\/wp-content\/uploads\/2026\/02\/QFeld-1536x1241.png 1536w, 
https:\/\/ffritze.de\/wp-content\/uploads\/2026\/02\/QFeld.png 1644w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/figure>\n<\/div>\n\n\n\n<div class=\"wp-block-columns are-vertically-aligned-center is-layout-flex wp-container-core-columns-is-layout-baaca9a6 wp-block-columns-is-layout-flex\">\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\" style=\"flex-basis:60%\">\n<p class=\"wp-block-paragraph\">At first, Scratch moves randomly across the field.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The small Scratch program shows you the direction in which to move your game piece (if the move is possible). To get a direction, click the green flag or click directly on the arrow.<\/p>\n<\/div>\n\n\n\n<div class=\"wp-block-column is-vertically-aligned-center is-layout-flow wp-block-column-is-layout-flow\">\n<div class=\"responsive-iframe\">\n    <iframe src=\"https:\/\/scratch.mit.edu\/projects\/1282076771\/embed\" allowtransparency=\"true\" frameborder=\"0\" scrolling=\"no\" allowfullscreen><\/iframe>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Scratch learns<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Our agent Scratch initially has no way of knowing the next step, so he moves across the field at random.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">However, Scratch knows his goal. When he reaches it, he draws an arrow in the field he came from, pointing in the direction he moved. On the next pass he is a little bit smarter.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Scratch now knows where to go as soon as he lands on an arrow. So whenever he lands on a field with an arrow, he likewise draws an arrow in the field he came from, pointing in the direction he moved. 
He can then follow the arrow he is standing on.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For detailed instructions on building this in Scratch, see the follow-up post\u00a0<a href=\"https:\/\/ffritze.de\/en\/q-learning-entwicklung-mit-scratch\/\" target=\"_blank\" rel=\"noreferrer noopener\">Q-Learning Entwicklung mit Scratch<\/a>.<\/p>\n\n\n\n<div style=\"height:25px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-group has-custom-hellblau-transparent-background-color has-background has-global-padding is-layout-constrained wp-container-core-group-is-layout-c9a371c5 wp-block-group-is-layout-constrained\" style=\"border-width:1px;border-top-left-radius:25px;border-top-right-radius:25px;border-bottom-left-radius:25px;border-bottom-right-radius:25px;padding-top:var(--wp--preset--spacing--30);padding-right:var(--wp--preset--spacing--30);padding-bottom:var(--wp--preset--spacing--30);padding-left:var(--wp--preset--spacing--30);box-shadow:var(--wp--preset--shadow--natural)\">\n<h3 class=\"wp-block-heading\">Did you find a way?<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Take another look at the drawn image at the top of the article. What could be wrong in it?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You've probably noticed that Scratch tried to run off the field at the same spot several times. 
How can you improve the process so that Scratch adjusts his behavior?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Make the field more interesting: build in obstacles or give the map a different shape.<\/p>\n<\/div>\n\n\n\n<div style=\"height:33px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<div class=\"wp-block-group has-custom-hellblau-transparent-background-color has-background has-global-padding is-layout-constrained wp-container-core-group-is-layout-c9a371c5 wp-block-group-is-layout-constrained\" style=\"border-width:1px;border-top-left-radius:25px;border-top-right-radius:25px;border-bottom-left-radius:25px;border-bottom-right-radius:25px;padding-top:var(--wp--preset--spacing--30);padding-right:var(--wp--preset--spacing--30);padding-bottom:var(--wp--preset--spacing--30);padding-left:var(--wp--preset--spacing--30);box-shadow:var(--wp--preset--shadow--natural)\">\n<p class=\"wp-block-paragraph\">Reinforcement learning is an approach in which an agent improves its decisions by interacting with its environment. Here, it lets Scratch learn how to reach his goal by moving randomly and drawing arrows.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">What feedback does Scratch get from his environment that he can learn from?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In which other scenarios or applications could reinforcement learning be used?<\/p>\n<\/div>\n\n\n\n<div style=\"height:33px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>","protected":false},"excerpt":{"rendered":"<p>The QField lets you recreate the process of reinforcement learning with a piece of paper and a pen. The goal is to learn a path across the field using an unplugged version of the algorithm. 
Unlike other algorithms that aim for optimal solutions, this approach focuses on learning through trial and feedback.<\/p>","protected":false},"author":2,"featured_media":439,"comment_status":"open","ping_status":"open","sticky":false,"template":"seite-thoughts-beitrag","format":"standard","meta":{"footnotes":""},"categories":[19,34],"tags":[36,37],"class_list":["post-438","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-algorithms","category-unplugged","tag-maschinelles-lernen","tag-reinforcement-learning"],"_links":{"self":[{"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/posts\/438","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/comments?post=438"}],"version-history":[{"count":0,"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/posts\/438\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/media\/439"}],"wp:attachment":[{"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/media?parent=438"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/categories?post=438"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ffritze.de\/en\/wp-json\/wp\/v2\/tags?post=438"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}