Reward is enough - Journal of Artificial Intelligence by remlaps-lite

· @remlaps-lite ·
$0.38
Reward is enough - Journal of Artificial Intelligence
<div class=pull-right>

[![](https://ars.els-cdn.com/content/image/1-s2.0-S0004370221X00057-cov150h.gif)](https://www.sciencedirect.com/science/article/pii/S0004370221000862)

</div>

<h6><sup>( May 24, 2021; <i>Artificial Intelligence</i> )</sup></h6>

<blockquote>

<b>Abstract</b>

In this article we hypothesise that intelligence, and its associated abilities, can be understood as subserving the maximisation of reward. Accordingly, reward is enough to drive behaviour that exhibits abilities studied in natural and artificial intelligence, including knowledge, learning, perception, social intelligence, language, generalisation and imitation. This is in contrast to the view that specialised problem formulations are needed for each ability, based on other signals or objectives. Furthermore, we suggest that agents that learn through trial and error experience to maximise reward could learn behaviour that exhibits most if not all of these abilities, and therefore that powerful reinforcement learning agents could constitute a solution to artificial general intelligence.

</blockquote>

Read the rest from the journal <i>Artificial Intelligence</i>: [Reward is enough](https://www.sciencedirect.com/science/article/pii/S0004370221000862)

- [PDF](https://www.sciencedirect.com/science/article/pii/S0004370221000862/pdfft?md5=12802032b840c6cc044e57c3a5aaa7c3&pid=1-s2.0-S0004370221000862-main.pdf)

---

-h/t [Communications of the ACM](https://cacm.acm.org/opinion/articles/253154-reward-is-enough-for-generalized-ai/fulltext)
Posted 2021-06-14 23:49:18 · Tags: steemlinks, technology, artificial-intelligence, penny4thoughts
@ruzmaira ·
$0.33
Hmm, interesting article. They are practically looking for a way for an artificial intelligence to have its own thoughts, so that it can manipulate any object, remember where it left something, and know how to choose between good and bad.

As for having facial expressions that depend on how you feel, hmm, I think these abilities would be a double-edged sword in the future, as we have talked about before.

Do you think that an artificial intelligence can develop some kind of sensation through reward stimulation?
Posted 2021-06-15 03:39:09
@remlaps ·
> Do you think that an artificial intelligence can develop some kind of sensation through reward stimulation?

Yeah, I do think so.  I remember reading about [this](https://ai.facebook.com/blog/near-perfect-point-goal-navigation-from-25-billion-frames-of-experience/) last year.

> The AI community has a long-term goal of building intelligent machines that interact effectively with the physical world, and a key challenge is teaching these systems to navigate through complex, unfamiliar real-world environments to reach a specified destination — without a preprovided map. We are announcing today that Facebook AI has created a new large-scale distributed reinforcement learning (RL) algorithm called DD-PPO, which has effectively solved the task of point-goal navigation using only an RGB-D camera, GPS, and compass data. Agents trained with DD-PPO (which stands for decentralized distributed proximal policy optimization) achieve nearly 100 percent success in a variety of virtual environments, such as houses and office buildings. We have also successfully tested our model with tasks in real-world physical settings using a LoCoBot and Facebook AI’s <A HREF="https://ai.facebook.com/blog/open-sourcing-pyrobot-to-accelerate-ai-robotics-research/">PyRobot platform</A>.

When they talk about "<i>reinforcement learning</i>", that's a reward-based learning model.
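
To make "reward-based" a bit more concrete, here is a minimal sketch of my own (not from the paper and not DD-PPO). It uses tabular Q-learning, a much simpler reinforcement learning method, on a made-up five-cell corridor. The corridor, the function names, and the parameter values are all assumptions for illustration. The only feedback the agent ever gets is a reward of 1.0 when it reaches the last cell.

```python
import random

N_STATES = 5                  # hypothetical corridor: cells 0..4
GOAL = N_STATES - 1           # reaching cell 4 ends the episode
ACTIONS = ("left", "right")   # action 0 = left, action 1 = right


def step(state, action):
    """Move one cell; the reward is 1.0 only when the goal is reached."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Best learned action for every non-goal cell."""
    return [ACTIONS[0] if q[s][0] > q[s][1] else ACTIONS[1] for s in range(GOAL)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Nobody tells the agent that "right" is the correct direction. It tries both moves, and the reward at the end of the corridor gradually propagates back through the table until moving right looks best from every cell.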
Posted 2021-06-16 23:45:48
@tanveer741 ·
$0.27
To understand the research paper, I found this YouTube video by Yannic Kilcher, a Machine Learning expert.

https://www.youtube.com/watch?v=dmH1ZpcROMk
Posted 2021-06-15 10:10:54
@remlaps-lite ·
Very cool video!  So far, I have only had time to skim the article, but I hope to read it later.  Meanwhile, I am listening to the video now.  He doesn't just explain it, but also presents some counterarguments and criticisms. Thank you very much.
Posted 2021-06-16 23:39:09
@primevaldad ·
$0.22
> Sophisticated abilities may arise from the maximisation of simple rewards in complex environments. 

There's a [You're Wrong About](https://www.buzzsprout.com/1112270/4446851-koko-the-gorilla) podcast episode that essentially makes the case that in order to communicate with sign language, Koko the gorilla had simply learned gestures for rewards with a more sophisticated framework than what we typically see in research. Obviously, that's an oversimplification, but there's a wonderful debate about the whole thing. 

>According to our hypothesis, the ability of language in its full richness, including all of these broader abilities, arises from the pursuit of reward. It is an instance of an agent's ability to produce complex sequences of actions (e.g. uttering sentences) based on complex sequences of observations (e.g. receiving sentences) in order to influence other agents in the environment (cf. discussion of social intelligence above) and accumulate greater reward [7].

If reward is enough, and seeking reward is a singular universal mechanism for the development of intelligence, it would seem that either side of the Koko argument is moot. The "Koko was only responding to rewards" camp is in fact just echoing the sentiment that Koko is demonstrating general intelligence. Therefore, that cannot by itself stand as an argument that Koko had not demonstrated general intelligence. In contrast, arguing that Koko did demonstrate a high level of intelligence would simply be reiterating the counterargument that Koko manifested sophisticated abilities through the maximization of rewards in complex environments.
Posted 2021-06-16 20:53:18
@remlaps ·
> If reward is enough, and seeking reward is a singular universal mechanism for the development of intelligence, it would seem that either side of the Koko argument is moot.

Very interesting point.  And this mirrors the question of free will and whether human intelligence is really anything more than just a biological form of computation -- i.e. <A HREF="https://youtu.be/C5DfnIjZPGw">Chalmers' <i>hard problem of consciousness</i></A>.
Posted 2021-06-16 23:52:54