VibeBuilders.ai


[D] Gary Marcus and Luis Lamb -- discussion of AGI and Neurosymbolic methods
reddit · LLM Vibe Score: 0 · Human Vibe Score: 1 · timscarfe · This week

https://youtu.be/nhUt6mKCPf8
Pod: https://anchor.fm/machinelearningstreettalk/episodes/54-Gary-Marcus-and-Luis-Lamb---Neurosymbolic-models-e125495

Professor Gary Marcus is a scientist, best-selling author, and entrepreneur. He is Founder and CEO of Robust.AI, and was Founder and CEO of Geometric Intelligence, a machine learning company acquired by Uber in 2016. Gary said in his recent "next decade" paper that without us, or other creatures like us, the world would continue to exist, but it would not be described, distilled, or understood. Human lives are filled with abstraction and causal description. This is so powerful. Francois Chollet said the other week that intelligence is literally sensitivity to abstract analogies, and that is all there is to it. It's almost as if one of the most important features of intelligence is the ability to abstract knowledge; this drives the generalisation that allows you to mine previous experience to make sense of many future novel situations.

Also joining us today is Professor Luis Lamb, Secretary of Innovation for Science and Technology of the State of Rio Grande do Sul, Brazil. His research interests are machine learning and reasoning, neuro-symbolic computing, logic in computation and artificial intelligence, cognitive and neural computation, and AI ethics and social computing. Luis released his new paper, "Neurosymbolic AI: The Third Wave," at the end of last year. It beautifully articulated the key ingredients needed in the next generation of AI systems, integrating type 1 and type 2 approaches to AI, and it summarises all of the achievements of the last 20 years of research.

We cover a lot of ground in today's show: the limitations of deep learning, Rich Sutton's "The Bitter Lesson" and "reward is enough", and the semantic foundation required for us to build robust AI.

Interview with Juergen Schmidhuber, renowned ‘Father of Modern AI’, who says his life’s work won't lead to dystopia
reddit · LLM Vibe Score: 0 · Human Vibe Score: 0.765 · hardmaru · This week

Schmidhuber interview expressing his views on the future of AI and AGI. Original source. I think the interview is of interest to r/MachineLearning, and presents an alternate view compared to other influential leaders in AI.

Juergen Schmidhuber, Renowned 'Father Of Modern AI,' Says His Life's Work Won't Lead To Dystopia
May 23, 2023. Contributed by Hessie Jones.

Amid the growing concern about the impact of more advanced artificial intelligence (AI) technologies on society, many in the technology community fear the implications of advancements in generative AI if they go unchecked. Dr. Juergen Schmidhuber, a renowned scientist and artificial intelligence researcher widely regarded as one of the pioneers of the field, is more optimistic. He declares that many of those who suddenly warn against the dangers of AI are just seeking publicity, exploiting the media's obsession with killer robots, which has attracted more attention than "good AI" for healthcare and the like. The potential to revolutionize various industries and improve our lives is clear, as are the equal dangers if bad actors leverage the technology for personal gain. Are we headed towards a dystopian future, or is there reason to be optimistic? I had a chance to sit down with Dr. Juergen Schmidhuber to understand his perspective on this seemingly fast-moving AI train that will leap us into the future.

As a teenager in the 1970s, Juergen Schmidhuber became fascinated with the idea of creating intelligent machines that could learn and improve on their own, becoming smarter than himself within his lifetime. This would ultimately lead to his groundbreaking work in the field of deep learning. In the 1980s, he studied computer science at the Technical University of Munich (TUM), where he earned his diploma in 1987. His thesis was on the ultimate self-improving machines that not only learn through some pre-wired, human-designed learning algorithm, but also learn and improve the learning algorithm itself. Decades later, this became a hot topic. He also received his Ph.D. at TUM in 1991 for work that laid some of the foundations of modern AI.

Schmidhuber is best known for his contributions to the development of recurrent neural networks (RNNs), the most powerful type of artificial neural network that can process sequential data such as speech and natural language. With his students Sepp Hochreiter, Felix Gers, Alex Graves, Daan Wierstra, and others, he published architectures and training algorithms for the long short-term memory (LSTM), a type of RNN that is widely used in natural language processing, speech recognition, video games, robotics, and other applications. LSTM has become the most cited neural network of the 20th century, and Business Week called it "arguably the most commercial AI achievement."

Throughout his career, Schmidhuber has received various awards and accolades for his groundbreaking work. In 2013, he was awarded the Helmholtz Prize, which recognizes significant contributions to the field of machine learning. In 2016, he received the IEEE Neural Network Pioneer Award for "pioneering contributions to deep learning and neural networks." The media have often called him the "father of modern AI," because the most cited neural networks all build on his lab's work. He is quick to point out, however, that AI history goes back centuries.
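For readers unfamiliar with the architecture mentioned above, here is a minimal, hedged sketch of pushing a batch of sequences through an LSTM in PyTorch. All sizes and the next-token head are illustrative assumptions, not anything from the interview:

```python
import torch
import torch.nn as nn

# Toy setup: character-level sequences; every dimension below is arbitrary.
vocab_size, embed_dim, hidden_dim = 100, 32, 64

embedding = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(input_size=embed_dim, hidden_size=hidden_dim, batch_first=True)
classifier = nn.Linear(hidden_dim, vocab_size)  # e.g., next-token prediction

tokens = torch.randint(0, vocab_size, (8, 20))  # batch of 8 sequences, length 20
x = embedding(tokens)                           # (8, 20, 32)
outputs, (h_n, c_n) = lstm(x)                   # per-step hidden states: (8, 20, 64)
logits = classifier(outputs)                    # per-step predictions: (8, 20, 100)
print(logits.shape)
```

The recurrent cell carries state across time steps, which is what lets the same small network process sequences of arbitrary length.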
Despite his many accomplishments, at the age of 60, he feels mounting time pressure towards building an Artificial General Intelligence within his lifetime and remains committed to pushing the boundaries of AI research and development. He is currently director of the KAUST AI Initiative, scientific director of the Swiss AI Lab IDSIA, and co-founder and chief scientist of AI company NNAISENSE, whose motto is "AI∀", a math-inspired way of saying "AI For All." He continues to work on cutting-edge AI technologies and applications to improve human health, extend human lives, and make life easier for everyone.

The following interview has been edited for clarity.

Jones: Thank you, Juergen, for joining me. You have signed letters warning about AI weapons. But you didn't sign the recent publication, "Pause Gigantic AI Experiments: An Open Letter"? Is there a reason?

Schmidhuber: Thank you, Hessie. Glad to speak with you. I have realized that many of those who warn in public against the dangers of AI are just seeking publicity. I don't think the latest letter will have any significant impact because many AI researchers, companies, and governments will ignore it completely. The proposal frequently uses the word "we" and refers to "us," the humans. But as I have pointed out many times in the past, there is no "we" that everyone can identify with. Ask 10 different people, and you will hear 10 different opinions about what is "good." Some of those opinions will be completely incompatible with each other. Don't forget the enormous amount of conflict between people.

The letter also says, "If such a pause cannot be quickly put in place, governments should intervene and impose a moratorium." The problem is that different governments ALSO have different opinions about what is good for them and for others. Great Power A will say, if we don't do it, Great Power B will, perhaps secretly, and gain an advantage over us. The same is true for Great Powers C and D.

Jones: Everyone acknowledges this fear surrounding current generative AI technology. Moreover, the existential threat of this technology has been publicly acknowledged by Sam Altman, CEO of OpenAI himself, calling for AI regulation. From your perspective, is there an existential threat?

Schmidhuber: It is true that AI can be weaponized, and I have no doubt that there will be all kinds of AI arms races, but AI does not introduce a new quality of existential threat. The threat coming from AI weapons seems to pale in comparison to the much older threat from nuclear hydrogen bombs that don't need AI at all. We should be much more afraid of half-century-old tech in the form of H-bomb rockets. The Tsar Bomba of 1961 had almost 15 times more destructive power than all weapons of WW-II combined. Despite the dramatic nuclear disarmament since the 1980s, there are still more than enough nuclear warheads to wipe out human civilization within two hours, without any AI. I'm much more worried about that old existential threat than the rather harmless AI weapons.

Jones: I realize that while you compare AI to the threat of nuclear bombs, there is a current danger that this technology can be put in the hands of humans and enable them to "eventually" exact further harms on individuals or groups in a very precise way, like targeted drone attacks. You are giving people a toolset that they've never had before, enabling bad actors, as some have pointed out, to do a lot more than they previously could because they didn't have this technology.
Schmidhuber: Now, all that sounds horrible in principle, but our existing laws are sufficient to deal with these new types of weapons enabled by AI. If you kill someone with a gun, you will go to jail. Same if you kill someone with one of these drones. Law enforcement will get better at understanding new threats and new weapons and will respond with better technology to combat these threats. Enabling drones to target persons from a distance, in a way that requires some tracking and some intelligence traditionally supplied by skilled humans, seems to me just an improved version of a traditional weapon, like a gun, which is, you know, a little bit smarter than the old guns. But, in principle, all of that is not a new development. For many centuries, we have had the evolution of better weaponry and deadlier poisons and so on, and law enforcement has evolved its policies to react to these threats over time. So, it's not that we suddenly have a new quality of existential threat that is much more worrisome than what we have had for about six decades. A large nuclear warhead doesn't need fancy face recognition to kill an individual. No, it simply wipes out an entire city with ten million inhabitants.

Jones: The existential threat that's implied is the extent to which humans have control over this technology. We see some early cases of opportunism which, as you say, tend to get more media attention than positive breakthroughs. But you're implying that this will all balance out?

Schmidhuber: Historically, we have a long tradition of technological breakthroughs that led to advancements in weapons for the purpose of defense but also for protection. From sticks, to rocks, to axes, to gunpowder, to cannons, to rockets... and now to drones... this has had a drastic influence on human history. But what has been consistent throughout history is that those who use technology to achieve their own ends are themselves facing the same technology, because the opposing side is learning to use it against them. That has been repeated over thousands of years of human history, and it will continue. I don't see the new AI arms race as anything remotely as existential a threat as the good old nuclear warheads.

You said something important: some people prefer to talk about the downsides rather than the benefits of this technology. But that's misleading, because 95% of all AI research and AI development is about making people happier and advancing human life and health.

Jones: Let's touch on some of those beneficial advances in AI research that have been able to radically change present-day methods and achieve breakthroughs.

Schmidhuber: All right! For example, eleven years ago, our team with my postdoc Dan Ciresan was the first to win a medical imaging competition through deep learning. We analyzed female breast cells with the objective of determining harmless cells versus those in the pre-cancer stage. Typically, a trained oncologist needs a long time to make these determinations. Our team, who knew nothing about cancer, was able to train an artificial neural network, which was totally dumb in the beginning, on lots of this kind of data. It was able to outperform all the other methods. Today, this is being used not only for breast cancer, but also for radiology and detecting plaque in arteries, and many other things.
Some of the neural networks that we have developed in the last three decades are now prevalent across thousands of healthcare applications, detecting diabetes, Covid-19, and whatnot. This will eventually permeate all of healthcare. The good consequences of this type of AI are much more important than the click-bait new ways of conducting crimes with AI.

Jones: Adoption is a product of reinforced outcomes. The massive scale of adoption either leads us to believe that people have been led astray, or, conversely, that technology is having a positive effect on people's lives.

Schmidhuber: The latter is the likely case. There's intense commercial pressure towards good AI rather than bad AI, because companies want to sell you something, and you are going to buy only stuff you think is going to be good for you. So, already just through this simple commercial pressure, you have a tremendous bias towards good AI rather than bad AI. However, doomsday scenarios like in Schwarzenegger movies grab more attention than documentaries on AI that improve people's lives.

Jones: I would argue that people are drawn to good stories: narratives that contain an adversary and struggle, but in the end have happy endings. And this is consistent with your comment on human nature and how history, despite its tendency for violence and destruction, somehow tends to correct itself. Let's take the example of a technology you are aware of: GANs, Generative Adversarial Networks, which today are used in applications for fake news and disinformation. In actuality, the purpose behind the invention of GANs was far from what they are used for today.

Schmidhuber: Yes, the name GANs was created in 2014, but we had the basic principle already in the early 1990s. More than 30 years ago, I called it artificial curiosity. It's a very simple way of injecting creativity into a little two-network system. This creative AI is not just trying to slavishly imitate humans. Rather, it's inventing its own goals. Let me explain: You have two networks. One network is producing outputs that could be anything, any action. Then the second network is looking at these actions and trying to predict their consequences. An action could move a robot, then something happens, and the other network is just trying to predict what will happen.

Now we can implement artificial curiosity: the prediction error of the second network is, at the same time, the reward of the first network. The first network wants to maximize its reward, so it will invent actions that lead to situations the second network has not yet learned to predict well. In the case where the outputs are fake images, the first network will try to generate images that are good enough to fool the second network, which will attempt to predict the reaction of the environment (fake or real image?) and will try to become better at it. The first network will continue to improve at generating images whose type the second network cannot yet predict. So, they fight each other. The second network will continue to reduce its prediction error, while the first network will attempt to maximize it. Through this zero-sum game, the first network gets better and better at producing convincing fake outputs which look almost realistic.
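To make the mechanism concrete, here is a minimal sketch of such an adversarial curiosity loop in PyTorch. This is an illustration under assumptions, not Schmidhuber's original 1990 system: a toy one-dimensional "environment", an actor network that invents actions, and a predictor (world model) whose prediction error serves as the actor's reward.

```python
import torch
import torch.nn as nn

# Toy environment: the consequence of an action is a fixed but unknown function.
def environment(action):
    return torch.sin(3.0 * action)  # stands in for "what happens" after an action

actor = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))      # network 1
predictor = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))  # network 2

opt_actor = torch.optim.Adam(actor.parameters(), lr=1e-3)
opt_pred = torch.optim.Adam(predictor.parameters(), lr=1e-3)

for step in range(2000):
    noise = torch.randn(64, 1)
    action = actor(noise)          # network 1 invents actions
    outcome = environment(action)  # the world responds

    # Network 2 minimizes its prediction error...
    pred_loss = (predictor(action.detach()) - outcome.detach()).pow(2).mean()
    opt_pred.zero_grad()
    pred_loss.backward()
    opt_pred.step()

    # ...while that same error is network 1's reward: it seeks surprising actions.
    surprise = (predictor(action) - environment(action)).pow(2).mean()
    actor_loss = -surprise  # maximize the predictor's error
    opt_actor.zero_grad()
    actor_loss.backward()
    opt_actor.step()
```

The zero-sum structure is visible in the two losses: the predictor descends on the very quantity the actor ascends.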
So, once you have an interesting set of images by Vincent van Gogh, you can generate new images that leverage his style, without the original artist having ever produced the artwork himself.

Jones: I see how the Van Gogh example can be applied in an education setting, and there are countless examples of artists mimicking styles from famous painters, but image generation of this kind that can happen within seconds is quite another feat. And you know this is how GANs have been used. What's more prevalent today is a socialized enablement of generating images or information to intentionally fool people. It also surfaces new harms involving the threat to intellectual property and copyright, which laws have yet to account for. And from your perspective, this was not the intention when the model was conceived. What was your motivation in your early conception of what is now GANs?

Schmidhuber: My old motivation for GANs was actually very important, and it was not to create deepfakes or fake news but to enable AIs to be curious and invent their own goals, to make them explore their environment and make them creative. Suppose you have a robot that executes one action, then something happens, then it executes another action, and so on, because it wants to achieve certain goals in the environment. For example, when the battery is low, this will trigger "pain" through hunger sensors, so it wants to go to the charging station without running into obstacles, which would trigger other pain sensors. It will seek to minimize pain (encoded through numbers).

Now the robot has a friend, the second network, which is a world model: a prediction machine that learns to predict the consequences of the robot's actions. Once the robot has a good model of the world, it can use it for planning; the model serves as a simulation of the real world, and the robot can then determine what a good action sequence is. If the robot imagines one sequence of actions, the model will predict a lot of pain, which it wants to avoid. If it plays an alternative action sequence in its mental model of the world, it will predict a rewarding situation where it's going to sit on the charging station and its battery will charge again. So, it will prefer to execute the latter action sequence.

In the beginning, however, the model of the world knows nothing, so how can we motivate the first network to generate experiments that lead to data that helps the world model learn something it didn't already know? That's what artificial curiosity is about. The dueling two-network system effectively explores uncharted environments by creating experiments, so that over time the curious AI gets a better sense of how the environment works. This can be applied to all kinds of environments, and has medical applications.

Jones: Let's talk about the future. You have said, "Traditional humans won't play a significant role in spreading intelligence across the universe."

Schmidhuber: Let's first conceptually separate two types of AIs. The first type of AI consists of tools directed by humans. They are trained to do specific things like accurately detect diabetes or heart disease and prevent attacks before they happen. In these cases, the goal comes from the human. More interesting AIs set their own goals. They invent their own experiments and learn from them. Their horizons expand, and eventually they become more and more general problem solvers in the real world.
They are not controlled by their parents, but much of what they learn is through self-invented experiments. A robot, for example, is rotating a toy, and as it is doing this, the video coming in through its camera eyes changes over time. It begins to learn how this video changes, how the 3D nature of the toy generates certain videos if you rotate it a certain way, and eventually how gravity works and how the physics of the world works. Like a little scientist!

And I have predicted for decades that future scaled-up versions of such AI scientists will want to further expand their horizons and eventually go where most of the physical resources are, to build more and bigger AIs. And of course, almost all of these resources are far away from Earth, out there in space, which is hostile to humans but friendly to appropriately designed AI-controlled robots and self-replicating robot factories. So here we are not talking any longer about our tiny biosphere; no, we are talking about the much bigger rest of the universe. Within a few tens of billions of years, curious self-improving AIs will colonize the visible cosmos in a way that's infeasible for humans. Those who don't won't have an impact. Sounds like science fiction, but since the 1970s I have been unable to see a plausible alternative to this scenario, except for a global catastrophe such as an all-out nuclear war that stops this development before it takes off.

Jones: How long have these AIs, which can set their own goals, existed? To what extent can they be independent of human interaction?

Schmidhuber: Neural networks like that have existed for over 30 years. My first simple adversarial neural network system of this kind is the one from 1990 described above. You don't need a teacher there; it's just a little agent running around in the world and trying to invent new experiments that surprise its own prediction machine. Once it has figured out certain parts of the world, the agent will become bored and move on to more exciting experiments. The simple 1990 systems I mentioned have certain limitations, but in the past three decades we have also built more sophisticated systems that set their own goals, and such systems, I think, will be essential for achieving true intelligence. If you are only imitating humans, you will never go beyond them. So, you really must give AIs the freedom to explore previously unexplored regions of the world in a way that no human is predefining.

Jones: Where is this being done today?

Schmidhuber: Variants of neural-network-based artificial curiosity are used today for agents that learn to play video games in a human-competitive way. We have also started to use them for the automatic design of experiments in fields such as materials science. I bet many other fields will be affected by it: chemistry, biology, drug design, you name it. However, at least for now, these artificial scientists, as I like to call them, cannot yet compete with human scientists. I don't think it's going to stay this way, but at the moment it's still the case. Sure, AI has made a lot of progress. Since 1997, there have been superhuman chess players, and since 2011, through the DanNet of my team, there have been superhuman visual pattern recognizers. But there are other things where humans, at the moment at least, are much better, in particular science itself.
In the lab we have many first examples of self-directed artificial scientists, but they are not yet convincing enough to appear on the radar screen of the public space, which is currently much more fascinated with simpler systems that just imitate humans and write texts based on previously seen human-written documents.

Jones: You speak of these numerous instances dating back 30 years of lab experiments where these self-driven agents are deciding and learning and moving on once they've learned. And I assume that rate of learning becomes even faster over time. What kind of timeframe are we talking about before this eventually is taken outside of the lab and embedded into society?

Schmidhuber: This could still take months or even years :-) Anyway, in the not-too-distant future, we will probably see artificial scientists who are good at devising experiments that allow them to discover new, previously unknown physical laws. As always, we are going to profit from the old trend that has held at least since 1941: every decade, compute gets 100 times cheaper.

Jones: How does this trend affect modern AI such as ChatGPT?

Schmidhuber: Perhaps you know that all the recent famous AI applications such as ChatGPT and similar models are largely based on principles of artificial neural networks invented in the previous millennium. The main reason why they work so well now is the incredible acceleration of compute per dollar. ChatGPT is driven by a neural network called the "Transformer", described in 2017 by Google. I am happy about that, because a quarter century earlier, in 1991, I had a particular Transformer variant which is now called the "Transformer with linearized self-attention". Back then, not much could be done with it, because the compute cost was a million times higher than today. But today, one can train such models on half the internet and achieve much more interesting results.
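As an aside for technically minded readers, here is a hedged sketch of the idea behind linearized self-attention. It is my own illustration of the concept, following later formulations of linear attention rather than any specific 1991 code; the feature map and sizes are assumptions:

```python
import torch

def feature_map(x):
    # A simple positive feature map; elu(x) + 1 is one common choice.
    return torch.nn.functional.elu(x) + 1.0

def linear_attention(Q, K, V, eps=1e-6):
    """Attention in O(n) time/memory in sequence length n, instead of O(n^2).

    Standard softmax attention computes softmax(Q K^T) V, which builds an
    n x n matrix. Linearized attention replaces the softmax with a kernel
    feature map phi, so the (d x d) K^T V product can be computed first.
    """
    Q, K = feature_map(Q), feature_map(K)                   # (n, d)
    KV = K.transpose(-2, -1) @ V                            # (d, d) summary
    Z = Q @ K.sum(dim=-2, keepdim=True).transpose(-2, -1)   # (n, 1) normalizer
    return (Q @ KV) / (Z + eps)

n, d = 1024, 64
Q, K, V = torch.randn(n, d), torch.randn(n, d), torch.randn(n, d)
out = linear_attention(Q, K, V)
print(out.shape)  # torch.Size([1024, 64])
```

The point of the reordering is that the sequence-length-squared attention matrix never has to be materialized, which is what made such variants cheap enough to contemplate long before modern hardware.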
Jones: And for how long will this acceleration continue?

Schmidhuber: There's no reason to believe that in the next 30 years we won't have another factor of 1 million, and that's going to be really significant. In the near future, for the first time, we will have many not-so-expensive devices that can compute as much as a human brain. The physical limits of computation, however, are much further out, so even if the trend of a factor of 100 every decade continues, the physical limits (of 10^51 elementary instructions per second and kilogram of matter) won't be hit until, say, the mid-next century. Even in our current century, however, we'll probably have many machines that compute more than all 10 billion human brains collectively, and you can imagine: everything will change then!

Jones: That is the big question. Is everything going to change? If so, what do you say to the next generation of leaders currently coming out of college and university? So much of this change is already impacting how they study, how they will work, and how the future of work and livelihood is defined. What is their purpose, and how do we change our systems so they will adapt to this new version of intelligence?

Schmidhuber: For decades, people have asked me questions like that. You know, what I'm saying now I have basically said since the 1970s; it's just that today people are paying more attention, because back then they thought it was science fiction. They didn't think that I would ever come close to achieving my crazy life goal of building a machine that learns to become smarter than myself, such that I can retire.

But now many have changed their minds and think it's conceivable. And now I have two daughters, 23 and 25. People ask me: what do I tell them? They know that Daddy always said, "It seems likely that within your lifetimes, you will have new types of intelligence that are probably going to be superior in many ways, and probably in all kinds of interesting ways." How should they prepare for that? And I kept telling them the obvious: learn how to learn new things! It's not like in the previous millennium, where within 20 years someone learned to be a useful member of society and then took a job for 40 years and performed in that job until she received her pension. Now things are changing much faster, and we must learn continuously just to keep up. I also told my girls that no matter how smart AIs are going to get, they should learn at least the basics of math and physics, because that's the essence of our universe, and anybody who understands it will have an advantage and will learn all kinds of new things more easily. I also told them that social skills will remain important, because most future jobs for humans will continue to involve interactions with other humans; but I couldn't teach them anything about that, as they know much more about social skills than I do.

You touched on the big philosophical question about people's purpose. Can this be answered without answering the even grander question: what's the purpose of the entire universe? We don't know. But what's happening right now might be connected to the unknown answer. Don't think of humans as the crown of creation. Instead, view human civilization as part of a much grander scheme, an important step (but not the last one) on the path of the universe from very simple initial conditions towards more and more unfathomable complexity. Now it seems ready to take its next step, a step comparable to the invention of life itself over 3.5 billion years ago. But don't worry; in the end, all will be good!

Jones: Let's get back to this transformation happening right now with OpenAI. There are many questioning the efficacy and accuracy of ChatGPT and concerned that its release was premature. In light of the rampant adoption, educators have banned its use over concerns of plagiarism and how it stifles individual development. Should large language models like ChatGPT be used in school?

Schmidhuber: When the calculator was first introduced, instructors forbade students from using it in school. Today, the consensus is that kids should learn the basic methods of arithmetic, but they should also learn to use the "artificial multipliers", aka calculators, even in exams, because laziness, or efficiency, is a hallmark of intelligence. Any intelligent being wants to minimize its efforts to achieve things. And that's the reason why we have tools, and why our kids are learning to use these tools. The first stone tools were invented maybe 3.5 million years ago; tools have just become more sophisticated over time. In fact, humans have changed in response to the properties of their tools; our anatomical evolution was shaped by tools such as spears and fire. So, it's going to continue this way. And there is no permanent way of preventing large language models from being used in school.

Jones: And when our children, your children, graduate, what does their future work look like?
Schmidhuber: A single human trying to predict details of how 10 billion people and their machines will evolve in the future is like a single neuron in my brain trying to predict what the entire brain and its tens of billions of neurons will do next year. 40 years ago, before the WWW was created at CERN in Switzerland, who would have predicted all those young people making money as YouTube video bloggers?

Nevertheless, let's make a few limited job-related observations. For a long time, people have thought that desktop jobs may require more intelligence than skilled trades or handicraft professions. But now it turns out that it's much easier to replace certain aspects of desktop jobs than to replace a carpenter, for example, because everything that works well in AI currently happens behind the screen, not so much in the physical world. There are now artificial systems that can read lots of documents and then make really nice summaries of them. That is a desktop job. Or you give them a description of an illustration that you want for your article, and pretty good illustrations are generated that may need only minimal fine-tuning. But all these desktop jobs are much easier to automate than the really tough jobs in the physical world. And it's interesting that the things people thought required intelligence, like playing chess, or writing or summarizing documents, are much easier for machines than they expected. But for things like playing football or soccer, there is no physical robot that can remotely compete with the abilities of a little boy with these skills. So, AI in the physical world, interestingly, is much harder than AI behind the screen in virtual worlds. And it's really exciting, in my opinion, to see that jobs such as plumbing are much more challenging than playing chess or writing another tabloid story.

Jones: The way data has been collected for these large language models (LLMs) does not guarantee personal information has been excluded, and current consent laws are already outdated when it comes to them. The concern, rightly so, is increasing surveillance and loss of privacy. What is your view on this?

Schmidhuber: As I have indicated earlier: are surveillance and loss of privacy inevitable consequences of increasingly complex societies? Super-organisms such as cities and states and companies consist of numerous people, just as people consist of numerous cells. These cells enjoy little privacy. They are constantly monitored by specialized "police cells" and "border guard cells": Are you a cancer cell? Are you an external intruder, a pathogen? Individual cells sacrifice their freedom for the benefits of being part of a multicellular organism. It is similar for super-organisms such as nations. Over 5,000 years ago, writing enabled recorded history and thus became its inaugural and most important invention. Its initial purpose, however, was to facilitate surveillance, to track citizens and their tax payments. The more complex a super-organism, the more comprehensive its collection of information about its constituents. 200 years ago, at least the parish priest in each village knew everything about all the village people, even about those who did not confess, because they appeared in the confessions of others. Also, everyone soon knew about the stranger who had entered the village, because some occasionally peered out of the window, and what they saw got around.
Such control mechanisms were temporarily lost through anonymization in rapidly growing cities, but they are now returning with the help of new surveillance devices such as smartphones, parts of digital nervous systems that tell companies and governments a lot about billions of users. Cameras, drones, and the like are becoming ever tinier and more ubiquitous. More effective face recognition and other detection technology is becoming cheaper and cheaper, and many will use it to identify others anywhere on earth; the big wide world will not offer any more privacy than the local village. Is this good or bad? Some nations may find it easier than others to justify more complex kinds of super-organisms at the expense of the privacy rights of their constituents.

Jones: So, there is no way to stop or change this process of collection, or how it continuously informs decisions over time? How do you see governance and rules responding to this, especially amid Italy's ban on ChatGPT following a suspected user data breach and the more recent news about Meta's record $1.3 billion fine for the company's handling of user information?

Schmidhuber: Data collection has benefits and drawbacks, such as the loss of privacy. How to balance those? I have argued for addressing this through data ownership in data markets. If it is true that data is the new oil, then it should have a price, just like oil. At the moment, the major surveillance platforms such as Meta do not offer users any money for their data and the resulting loss of privacy. In the future, however, we will likely see attempts at creating efficient data markets to figure out the data's true financial value through the interplay between supply and demand. Even some sensitive medical data should be priced not by governmental regulators but by the patients (and healthy persons) who own it and who may sell or license parts thereof as micro-entrepreneurs in a healthcare data market.

Following a previous interview I gave for one of the largest re-insurance companies, let's look at the different participants in such a data market: patients, hospitals, data companies. (1) Patients with a rare form of cancer can offer more valuable data than patients with a very common form of cancer. (2) Hospitals and their machines are needed to extract the data, e.g., through magnetic spin tomography, radiology, evaluations by human doctors, and so on. (3) Companies such as Siemens, Google or IBM would like to buy annotated data to make better artificial neural networks that learn to predict pathologies and diseases and the consequences of therapies.

Now the market's invisible hand will decide the data's price through the interplay between demand and supply. On the demand side, you will have several companies offering something for the data, maybe through an app on the smartphone (a bit like a stock market app). On the supply side, each patient in this market should be able to profit from high prices for rare, valuable types of data. Likewise, competing data extractors such as hospitals will profit from gaining recognition and trust for extracting data well at a reasonable price. The market will make the whole system efficient through incentives for all who are doing a good job. Soon there will be a flourishing ecosystem of commercial data market advisors and whatnot, just like the ecosystem surrounding the traditional stock market.
The value of the data won't be determined by governments or ethics committees, but by those who own the data and decide by themselves which parts thereof they want to license to others under certain conditions. At first glance, a market-based system seems detrimental to the interests of certain monopolistic companies, as they would have to pay for the data; some would prefer free data and keeping their monopoly. However, since every healthy and sick person in the market would suddenly have an incentive to collect and share their data under self-chosen anonymity conditions, there would soon be much more useful data for evaluating all kinds of treatments. On average, people will live longer and healthier lives, and many companies and the entire healthcare system will benefit.

Jones: Finally, what is your view on open source versus private companies like Google and OpenAI? Is there a danger in supporting these private companies' large language models versus trying to keep these models open source and transparent, very much like what LAION is doing?

Schmidhuber: I signed this open letter by LAION because I strongly favor the open-source movement. And I think it's also something that is going to challenge whatever big-tech dominance there might be at the moment. Sure, the best models today are run by big companies with huge budgets for computers, but the exciting fact is that open-source models are not so far behind; some people say maybe only six to eight months. Of course, the private company models are all based on stuff that was created in academia, often in little labs without much funding, which publish without patenting their results and open-source their code, and others take it and improve it. Big tech has profited tremendously from academia; their main achievement is that they have scaled everything up greatly, sometimes even failing to credit the original inventors.

So, it's very interesting to see that as soon as some big company comes up with a new scaled-up model, lots of students out there are competing, or collaborating, with each other, trying to come up with equal or better performance on smaller networks and smaller machines. And since they are open-sourcing, the next person can have another great idea to improve it, so now there's tremendous competition also for the big companies. Because of that, and since AI is still getting exponentially cheaper all the time, I don't believe that big tech companies will dominate in the long run. They find it very hard to compete with the enormous open-source movement. As long as you can encourage the open-source community, I think you shouldn't worry too much.

Now, of course, you might say that if everything is open source, then bad actors will also more easily have access to these AI tools. And there's truth to that. But as always since the invention of controlled fire, it has been good that knowledge about how technology works quickly became public, such that everybody could use it. And then, against any bad actor, there's almost immediately a counter-actor trying to nullify his efforts. You see, I still believe in our old motto "AI∀", or "AI For All."

Jones: Thank you, Juergen, for sharing your perspective on this amazing time in history. It's clear that with new technology, the enormous potential can be matched by disparate and troubling risks which we've yet to solve, and even those we have yet to identify.
If we are to dispel the fear of a sentient system over which we have no control, humans alone need to take steps toward more responsible development and collaboration, to ensure AI technology is ultimately used to benefit society. Humanity will be judged by what we do next.

[D] Why is the AI Hype Absolutely Bonkers
reddit · LLM Vibe Score: 0 · Human Vibe Score: 1 · good_rice · This week

Edit 2: Both the repo and the post were deleted. Redacting identifying information, as the author appears to have made rectifications, and it'd be pretty damaging if this is what came up when googling their name / GitHub (hopefully they've learned a career lesson and can move on).

TL;DR: A PhD candidate claimed to have achieved 97% accuracy for detecting coronavirus from chest x-rays. Their post gathered thousands of reactions, and the candidate was quick to recruit branding, marketing, frontend, and backend developers for the project. Heaps of praise all around. He listed himself as a Director of XXXX (redacted), the new name for his project. The accuracy was based on a training dataset of ~30 images of lesioned / healthy lungs, sharing of data between test / train / validation, and code to train ResNet50 from a PyTorch tutorial. Nonetheless: thousands of reactions and praise from the "AI | Data Science | Entrepreneur" community.

Original Post: I saw this post circulating on LinkedIn: https://www.linkedin.com/posts/activity-6645711949554425856-9Dhm Here, a PhD candidate claims to achieve great performance with "ARTIFICIAL INTELLIGENCE" to predict coronavirus, asks for more help, and garners tens of thousands of views. The repo housing this ARTIFICIAL INTELLIGENCE solution already has a backend, frontend, branding, a README translated into 6 languages, and a call to spread the word for this wonderful technology. Surely, I thought, this researcher has some great and novel tech to justify all of this hype? I mean, dear god, we have branding, and the author has listed himself as the founder of an organization based on this project. Anything with this much attention, with dozens of "AI | Data Scientist | Entrepreneur" members of LinkedIn praising it, must have some great merit, right?

Lo and behold, we have ResNet50 (from torchvision.models import resnet50) with its linear layer replaced. We have a training dataset of 30 images. This should've taken at MAX 3 hours to put together: 1 hour for following a tutorial, and 2 for obfuscating the training with unnecessary code. I genuinely don't know what to think other than that this is bonkers. I hope I'm wrong, and there's some secret model this author is hiding? If so, I'll delete this post, but I looked through the repo (REPO link redacted) and that's all I could find. I'm at a loss for thoughts. Can someone explain why this stuff trends on LinkedIn, gets thousands of views and reactions, and gets loads of praise from "expert data scientists"? It's almost offensive to people who are, like, actually working to treat coronavirus and develop real solutions. It also seriously turns me off from pursuing an MS in CV as opposed to CS.

Edit: It turns out there were duplicate images between test / val / training, as if ResNet50 on 30 images wasn't enough already. He's also posted an update signed as "Director of XXXX (redacted)". This seems like a straight-up sleazy way to capitalize on the pandemic by advertising himself as the head of a made-up organization, pulling resources away from real biomedical researchers.
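For context, the "tutorial-level" approach being criticized amounts to something like the following sketch. This is a hypothetical reconstruction, not the author's actual code: load a pretrained ResNet50, swap the final linear layer for a two-class head, and fine-tune on a handful of images.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

# Load an ImageNet-pretrained ResNet50 and replace its final linear layer
# with a two-class head (e.g., "healthy" vs. "lesion").
model = resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# Stand-in batch: ~30 images is far too few to support a 97% accuracy claim,
# especially if images leak between the train / val / test splits.
images = torch.randn(30, 3, 224, 224)
labels = torch.randint(0, 2, (30,))

model.train()
for epoch in range(5):
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

There is nothing wrong with transfer learning itself; the point of the post is that this pattern plus a leaky 30-image dataset cannot support the claims that were made.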

[R] From 3D Contour Plots to AI-Generated Art
reddit · LLM Vibe Score: 0 · Human Vibe Score: 1 · MLRecipes · This week

Fun tutorial to learn how to make professional contour plots in Python, with incredible animated visualizations. At the intersection of machine learning, scientific computing, automated art, cartography, and video games. Section 3 is particularly interesting, as it shows all the work behind the scenes to complete this project in 20 hours when you have no idea how to start.

Video: https://reddit.com/link/ycg6c6/video/kycotrx09sv91/player

There is far more than just creating 3D contour plots in this article. First, you will learn how to produce data videos. I have shared quite a few in the past (with source code), but this is probably the simplest example. The data video also illustrates that a mixture of Gaussian-like distributions is typically not Gaussian-like, and may or may not be unimodal. It is borderline art (automatically generated), and certainly a stepping stone for professionals interested in computer vision or designing video games. It is easy to imagine a game based on my video, entitled "flying above menacingly rising mountains". The data video, through various rotations, also gives you a much better view of your data. It is also perfect for showing systems that evolve over time: a time series where each observation is an image.

In addition, unlike most tutorials found online, this one does a rather deep dive on a specific, rather advanced function from a library truly aimed at scientific computing. In the same way that images (say, pictures of hand-written digits) can be summarized by 10 parameters to perform text recognition, here 20 parameters allow you to perform topography classification: not just of static terrain, but of terrain that changes over time, assuming you have access to 50,000 videos representing different topographies. You can produce the videos needed for supervised classification with the code in section 2. The next step is to use data (videos) from the real world, and use the model trained on synthetic data for classification.

Read the full article with illustration (data video) and Python code, here.
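As a flavor of the kind of plot the tutorial covers, here is a minimal, self-contained sketch (my own illustration, not the article's code) that renders a 3D surface of a mixture of two Gaussian-like bumps with matplotlib, plus flat contour lines projected below it:

```python
import numpy as np
import matplotlib.pyplot as plt

# Grid over the plane.
x = np.linspace(-3, 3, 200)
y = np.linspace(-3, 3, 200)
X, Y = np.meshgrid(x, y)

# Mixture of two Gaussian-like bumps; note the mixture itself is not
# Gaussian and can be bimodal, as the article's data video illustrates.
def bump(cx, cy, s):
    return np.exp(-((X - cx) ** 2 + (Y - cy) ** 2) / (2 * s ** 2))

Z = 0.6 * bump(-1.0, -0.5, 0.7) + 0.4 * bump(1.2, 0.8, 0.5)

fig = plt.figure(figsize=(7, 5))
ax = fig.add_subplot(projection="3d")
ax.plot_surface(X, Y, Z, cmap="terrain", linewidth=0)
ax.contour(X, Y, Z, levels=12, zdir="z", offset=-0.2, cmap="terrain")
ax.set_zlim(-0.2, 1.0)
plt.show()
```

Animating such a figure (e.g., rotating the camera with `ax.view_init` over frames) is the basic recipe behind the data videos described above.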

[R] Stanford HAI Spring Conference - Intelligence Augmentation: AI Empowering People to Solve Global Challenges
reddit · LLM Vibe Score: 0 · Human Vibe Score: 0 · othotr · This week

Stanford Institute for Human-Centered AI hosted its spring conference today, with interesting conversations about how AI can best support humans in healthcare, art, and education to address global challenges. More details and the event recording are available at the HAI conference site. Here is a quick outline with video sections:

Welcome & Introductions: HAI directors Fei-Fei Li, John Etchemendy, Russ Altman, & James Landay

Session I: Healthcare
- Immersive Technologies for Caregiving: Innovation Opportunities and Ecosystem Challenges, Deborah Estrin @ Cornell Tech
- Student Lightning Talks
- On Complementing and Extending Human Intellect: Principles and Directions, Eric Horvitz @ Microsoft
- Mobilizing AI to Achieve Healthy Child Development Worldwide, Dennis Wall @ Stanford
- Safer and Proactive Care through AI, Suchi Saria @ Johns Hopkins University

Session II: Art
- Other Intelligence: Exoticism and AI, Ken Goldberg @ UC Berkeley
- Student Lightning Talks
- Artful Intelligence: Exoticism and AI, Michele Elam @ Stanford
- The Digital Griot: A Reimagining of the Archive, Rashaad Newsome @ Stanford
- Amplifying the Human Artist Through AI, Hilary Hahn & Carol Reiley @ DeepMusic.ai

Session III: Education
- Escaping or Automating a Legacy of Bad Instruction, Daniel Schwartz @ Stanford
- Student Lightning Talks
- AI to Super Power Teachers, Chris Piech @ Stanford
- Pushing the Boundaries of Educational Technology, Amy Ogan @ Carnegie Mellon University
- AI to Accelerate Workplace Learning at Scale, Candace Thille @ Amazon

[D] Should We Be Concerned About The Failure Of Evolutionary Algorithms, And Its Implications?
reddit · LLM Vibe Score: 0 · Human Vibe Score: -1 · mystikaldanger · This week

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6287292/

A number of possible explanations for [why we can't evolve complex software] could be considered. We tried to be as comprehensive as possible in this section, but it is possible that we have not considered some plausible explanations:

- Incompetent programmers: It is theoretically possible, but highly unlikely, that out of thousands of scientists working on evolutionary computation, all failed to correctly implement the Darwinian algorithm.

- Nonrepresentative algorithms: Some have suggested that EAs do not accurately capture the theory of evolution, but of course that would imply that the theory itself is not specified in sufficient detail to make falsifiable predictions. If such more detailed specifications are available to GP believers, it is up to them to implement them as computer simulations for testing purposes, but no examples of such work are known to have succeeded in evolving software.

- Inadequate fitness functions: The fitness function for a complex software product is difficult to outline and specify, and may be as complex (or even more complex) than the software we want to evolve, as it has to consider all the possible use cases and pass all unit tests. This may be the Achilles heel of GP, but it is also an objection to the feasibility of programming in general, as both have to convert a software specification into source code. If human programmers and biological evolution succeed under such constraints, so should Darwinian simulations.

- The halting problem: Turing proved that it is impossible to determine whether an arbitrary program halts, but this is also a problem for human programmers and can easily be addressed by placing time limits on considered solutions.

- Program correctness: If we require evolved software to be provably correct, this would present a problem, as GP does not verify produced designs but only tests them against specific unit tests. Likewise, we cannot rely on automated software verification, as it is still an unsolved problem in the general case. This is not really a problem, as most human-written software is never proven correct, and only a small portion of the software engineering process relies on formal specification and test-driven development.

- Inappropriate solutions: The literature on EA is full of examples of the surprising creativity of the Darwinian algorithm, resulting in solutions which match the letter of the design specifications but not the spirit. This is similar to human-produced software and the numerous ways in which such software fails the goals of the initial design.

- Insufficient complexity of the environment (not enough data, poor fitness functions): It is possible that the simulated environment is not complex enough to generate high-complexity outputs in evolutionary simulations. This does not seem correct, as the Internet presents a highly complex landscape in which many self-modifying computer viruses roam. Likewise, virtual worlds such as Second Life and many others present close approximations to the real world and are certainly more complex than the early Earth was: "A skeptic might insist that an abstract environment would be inadequate for the evolution ..., believing instead that the virtual environment would need to closely resemble the actual biological environment in which our ancestors evolved."
"Creating a physically realistic virtual world would require a far greater investment of computational resources than the simulation of a simple toy world or abstract problem domain (whereas evolution had access to a physically realistic real world 'for free'). In the limiting case, if complete microphysical accuracy were insisted upon, the computational requirements would balloon to utterly infeasible proportions." Requiring more realistic environmental conditions may result in an increase in necessary computational resources, a problem addressed in the next bullet.

- Insufficient resources (compute, memory): From the history of computer science, we know of many situations (speech recognition, NN training) where we had a correct algorithm but insufficient computational resources to run it to success. It is possible that we simply do not have hardware powerful enough to emulate evolution. We address this possibility in the section "Computational Complexity of Biological Evolution and Available Compute."

- Software design is not amenable to evolutionary methods: The space of software designs may be discrete, with no continuous path via incremental fitness to the desired solutions. This is possible, but it implies that the original goals of GP are unattainable and misguided. In addition, because a clear mapping exists between software as solutions to problems and animals as solutions to environmental problems, this would also imply that the current explanation for the origin of the species is incorrect.

- The Darwinian algorithm is incomplete or wrong: Finally, we have to consider the possibility that the inspiration behind evolutionary computation, the Darwinian algorithm itself, is wrong or at least partially incomplete. If that were true, computer simulations of the algorithm would fail to produce results comparable with what we observe in nature, and a search for an alternative algorithm would need to take place. This would be an extraordinary claim, and it would require that we discard all the other possible explanations on this list.

We challenge the EA community to prove us wrong by producing an experiment which evolves nontrivial software from scratch and without human help. That would be the only way in which our findings could be shown to be incorrect. Perhaps reframing the problem in terms of maximizing the negentropy of digital organisms, as suggested by Schrödinger, Michaelian, and Ulanowicz and Hannon, with negative entropy being a fundamental property of all life-forms, may produce better results.

On the positive side, the fact that it seems impossible to evolve complex software implies that we are unlikely to be able to evolve highly sophisticated artificially intelligent agents, which might present a significant risk to our safety and security. Just imagine what would have happened if the very first time we ran a simulation of evolution on a computer, it had produced a superintelligent agent. Yampolskiy has shown that programming as a problem is AI-complete; if GP could solve programming, that would imply that GP = AGI (artificial general intelligence), but we see no experimental evidence for such a claim. In fact, it is more likely that once we have AGI, it could be used to create an intelligent fitness function for GP and thus evolve software. Genetic programming will not be the cause of AI, but a product of it. However, neuroevolution methods for optimizing deep learning architectures and parameters remain a strong possibility for the creation of AGI.
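To ground the discussion, here is a minimal sketch of the Darwinian algorithm the paper refers to: a toy genetic algorithm evolving a string toward a fixed target via selection and mutation. It is my illustration, not the paper's code, and it makes the paper's point visible: the fitness function for a known toy target is trivial to write, while specifying fitness for "complex software" is the hard part.

```python
import random

TARGET = "print('hello')"
ALPHABET = [chr(c) for c in range(32, 127)]

def fitness(candidate):
    # Count matching characters: trivially easy to specify for a fixed target,
    # but nothing like a fitness function for arbitrary correct software.
    return sum(a == b for a, b in zip(candidate, TARGET))

def mutate(candidate, rate=0.05):
    return "".join(random.choice(ALPHABET) if random.random() < rate else ch
                   for ch in candidate)

# Random initial population of candidate "programs" (here, just strings).
population = ["".join(random.choices(ALPHABET, k=len(TARGET))) for _ in range(200)]

for generation in range(1000):
    population.sort(key=fitness, reverse=True)
    if population[0] == TARGET:
        print(f"Reached target in generation {generation}")
        break
    survivors = population[:50]                       # selection
    population = [mutate(random.choice(survivors))    # reproduction with mutation
                  for _ in range(200)]
```

This converges quickly precisely because the fitness landscape is smooth and the target is known in advance; the paper's argument is that neither property holds when the goal is nontrivial working software.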

[N] TheSequence Scope: When it comes to machine learning, size matters: Microsoft's DeepSpeed framework, which can train a model with up to a trillion parameters
reddit · LLM Vibe Score: 0 · Human Vibe Score: 1 · Kseniase · This week

Hi there! Offering to your attention the latest edition of a weekly ML newsletter that focuses on three things: impactful ML research papers, cool ML tech solutions, and ML use cases supported by investors. Please see it below. Reddit is a new thing for me, and I've been struggling a bit with it, so please don't judge me too harshly for this promotion. This weekly digest is free, and I hope you find the format convenient. Your feedback is very much appreciated, and please feel free to sign up if you like it.

📝 Editorial

The recent emergence of pre-trained language models and transformer architectures pushed the creation of larger and larger machine learning models. Google's BERT presented the attention mechanism and transformer architecture as the "next big thing" in ML, and the numbers seem surreal. OpenAI's GPT-2 set a record with 1.5 billion parameters, followed by Microsoft's Turing-NLG with 17 billion parameters, only for the new GPT-3 to reach an astonishing 175 billion parameters. Lest anyone feel complacent, just this week Microsoft announced a new release of its DeepSpeed framework (which powers Turing-NLG), which can train a model with up to a trillion parameters. That sounds insane, but it really isn't.

What we are seeing is a consequence of several factors. First, computation power and parallelization techniques have evolved to a point where it is relatively easy to train machine learning models in large clusters of machines. Second, and most importantly, in the current state of machine learning, larger models have regularly outperformed smaller and more specialized models. Knowledge-reusability methods like transfer learning are still in very nascent stages. As a result, it's really hard to build small models that can operate in uncertain environments. Furthermore, as models like GPT-3 and Turing-NLG have shown, there is some unexplainable magic that happens after models go past a certain size. Many of the immediate machine learning problems might be solved by scaling the current generation of neural network architectures. Plain and simple: when it comes to machine learning, size matters.

We would love to hear your opinions about the debate between broader-larger vs. smaller and more specialized models. Leave a comment.

Now, to the most important developments in the AI industry this week.

🔎 ML Research

GPT-3 Falls Short in Machine Comprehension: Proposed by researchers from a few major American universities, a 57-task test to measure models' ability to reason poses challenges even for sophisticated models like GPT-3 -> read more in the original paper

Better Text Summarization: OpenAI published a paper showing a reinforcement learning with human feedback technique that can surpass supervised models -> read more on the OpenAI blog

Reinforcement Learning with Offline Datasets: Researchers from the Berkeley AI Research (BAIR) Lab published a paper unveiling a method that uses offline datasets to improve reinforcement learning models -> read more on the BAIR blog

🤖 Cool AI Tech Releases

New Version of DeepSpeed: Microsoft open-sourced a new version of DeepSpeed, an open-source library for parallelizable training that can scale up to models with 1 trillion parameters -> read more on the Microsoft Research blog

💸 Money in AI

AI-powered customer experience management platform Sprinklr has raised $200 million (kudos to our subscribers from Sprinklr 👏).
Sprinklr's "AI listening processing" solution allows companies to extract structured, meaningful sentiment and insights from the unstructured customer data generated by public conversations on websites and social platforms.

Xometry, an on-demand industrial parts marketplace, raised $75 million in Series E funding. The company provides a digital way of matching buyers with the right manufacturers, another example of AI applied to pairing the two sides of a deal.

Real estate tech company Orchard raised $69 million in its recent funding round. Orchard aims to digitize the whole real estate market by developing a solution that combines machine learning and rapid human assistance to smooth the search, match the right deal, and simplify buying and selling relationships.

Cybersecurity startup Pcysys raised $25 million in its funding round. Pcysys' platform, which requires no installation or network reconfiguration, uses algorithms to scan and "ethically" attack enterprise networks.

Robotics farming company Iron Ox raised $20 million in a funding round. Its system of farming robots is still semi-autonomous; the company's goal is to become fully autonomous.

Insurtech company Descartes Underwriting raised $18.5 million. The company applies AI and machine learning to climate-risk prediction and insurance underwriting.

Legaltech startup ThoughtRiver raised $10 million in its Series A round. Its AI solution for contract pre-screening aims to boost operational efficiency.

Medtech startup Skin Analytics raised $5.1 million in Series A funding. Skin Analytics has developed a clinically validated AI system that can identify not only the important skin cancers but also precancerous lesions that can be treated, as well as a range of benign lesions.

Amazon, along with several government organizations and three other industry partners, helped fund a high-priority AI research initiative led by the National Science Foundation. The amount of funding was not disclosed.

The content of TheSequence is written by Jesus Rodriguez, one of the most-read contributors to KDNuggets and TDS. You can check his Medium here.
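Relating to the DeepSpeed release above, here is a minimal, hedged sketch of how a DeepSpeed training loop is typically wired up. The config values are illustrative only; trillion-parameter runs additionally require the `deepspeed` launcher and a large multi-GPU cluster, and the exact `deepspeed.initialize` keywords may vary slightly across versions.

```python
# Minimal DeepSpeed training sketch (illustrative; assumes `pip install deepspeed`,
# a CUDA environment, and launching via the `deepspeed` CLI for multi-GPU runs).
import torch
import deepspeed

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
)

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    # ZeRO partitions optimizer state, gradients, and parameters across GPUs,
    # which is the mechanism behind DeepSpeed's very-large-model claims.
    "zero_optimization": {"stage": 3},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model and optimizer for distributed training.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

for step in range(10):
    x = torch.randn(32, 1024).to(model_engine.device).half()
    loss = model_engine(x).pow(2).mean()
    model_engine.backward(loss)  # DeepSpeed handles scaling/partitioning
    model_engine.step()
```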

[N] Last Week in AI News Digest 08/15-08/21: detecting hate speech, dogfight simulation, disaster-response, and more!
reddit
LLM Vibe Score0
Human Vibe Score-0.5
regalalgorithmThis week

[N] Last Week in AI News Digest 08/15-08/21: detecting hate speech, dogfight simulation, disaster-response, and more!

Hi there, we at Skynet Today produce a weekly newsletter summarizing each week's major AI news, which seems like it'd be of interest to this subreddit. Here's what's in our latest one:

Facebook's AI for detecting hate speech is facing its biggest challenge yet

Facebook has made significant progress recently in proactively taking down content that violates its community standards. For example, in the second quarter of 2020, Facebook took down 104.6 million pieces of content. While reviews are typically performed by a vast workforce of human moderators, AI-powered tools have enabled Facebook to do this work at greater scale for textual content. However, there's a long way to go for these systems to match or exceed the capabilities of human moderators. This is because a large proportion of hate speech and misinformation takes the form of images and memes, and reasoning about the context and the language-image interplay is an extremely difficult challenge for AI. Given Facebook's scale and the speed at which some use it to spread hate, incite violence, and share lies with millions, Facebook will have to keep running to catch up.

AI Slays Top F-16 Pilot In DARPA Dogfight Simulation

The Defense Advanced Research Projects Agency (DARPA) recently hosted a simulated F-16 dogfight competition, with different AI bots competing against each other as well as against human pilots. The top AI bot was able to beat a human pilot 5-0 in the simulated contest. DARPA started this program "as a risk-reduction effort [...] to flesh out how human and machine pilots share operational control of a fighter jet to maximize its chances of mission success." The competition's organizers are broadly optimistic about the demonstration of AI capabilities, even if they are not close to being deployed on a real aircraft. Of concern, the program included little discussion of the ethics of AI military applications, especially with lethal autonomous weapon systems under consideration.

News

Advances & Business

Microsoft, Energy Dept. to Develop Disaster-Response AI Tools - The U.S. Department of Energy and Microsoft Corp. on Tuesday announced a partnership to develop artificial-intelligence tools aimed at helping first responders better react to fast-changing natural events, such as floods and wildfires.

Coronavirus: Robot CERi is a bilingual Covid-19 expert - Ceri is bilingual, clued-up on coronavirus and can tell what mood you are in. Ceri also happens to be a robot.

Moscow DOH uses AI platform to detect lung cancer symptoms - Moscow's department of health is using an artificial intelligence (AI) platform to detect symptoms of lung cancer in CT scans, as part of a project to implement AI technology for radiology.

Scientists develop artificial intelligence system for high precision recognition of hand gestures - The recognition of human hand gestures by AI systems has been a valuable development over the last decade and has been adopted in high-precision surgical robots, health monitoring equipment and gaming systems.

Forget credit cards - now you can pay with your face. Creepy or cool? - A new way to pay has arrived in Los Angeles: your face.

Concerns & Hype

The dystopian tech that companies are selling to help schools reopen sooner - This fall, AI could be watching students social distance and checking their masks. Thousands of schools nationwide will not be reopening this fall.
NYPD Used Facial Recognition Technology In Siege Of Black Lives Matter Activist's Apartment - The NYPD deployed facial recognition technology in its hunt for a prominent Black Lives Matter activist, whose home was besieged by dozens of officers and police dogs last week, a spokesperson confirmed to Gothamist.

Machines can spot mental health issues - if you hand over your personal data - Digital diagnosis could transform psychiatry by mining your most intimate data for clues. But is the privacy cost worth it?

Supporting Black Artists Who Are Examining AI - Technology has a complicated relationship with racial justice. Smartphones, internet platforms, and other digital tools can be used to document and expose racism. But digital tools can also fuel racism: smart doorbells surveil Black individuals.

A-level and GCSE results in England to be based on teacher assessments in U-turn - All A-level and GCSE results in England will be based on grades assessed by teachers instead of algorithms.

Analysis & Policy

GPT-3 and The Question of Automation - Automation is not an all-or-nothing proposition. An AI model's automation capability is highly conjoined with the task and application it is used in.

An A.I. Movie Service Could One Day Serve You a New Custom Film Every Time - How long will it be until an A.I. can make an actual feature film on demand?

Fairness, evidence, and predictive equality - How the causal fairness principle relates to predictive equality.

How robotics and automation could create new jobs in the new normal - Depending on who you ask, AI and automation will either destroy jobs or create new ones. In reality, a greater push toward automation will probably both kill and create jobs: human workers will become redundant in certain spheres, sure, but many new roles will likely crop up.

Expert Opinions & Discussion within the field

Too many AI researchers think real-world problems are not relevant - The community's hyperfocus on novel methods ignores what's really important.

[N] Ethan Caballero: Broken Neural Scaling Laws | New Podcast Episode
reddit
LLM Vibe Score0
Human Vibe Score0
evc123This week

[N] Ethan Caballero: Broken Neural Scaling Laws | New Podcast Episode

video: https://www.youtube.com/watch?v=SV87S38M1J4

OUTLINE:
00:00 Introduction
00:50 The "Scale Is All You Need" Movement
01:07 A Functional Form Predicting Every Scaling Behavior
01:40 A Break Between Two Straight Lines On A Log-Log Plot
02:32 The Broken Neural Scaling Laws Equation
04:04 Extrapolating A Ton Of Large-Scale Vision And Language Tasks
04:49 Upstream And Downstream Have Different Breaks
05:22 Extrapolating Four-Digit Addition Performance
06:11 On The Feasibility Of Running Enough Training Runs
06:31 Predicting Sharp Left Turns
07:51 Modeling Double Descent
08:41 Forecasting Interpretability And Controllability
09:33 How Deception Might Happen In Practice
10:24 Sinister Stumbles And Treacherous Turns
11:18 Recursive Self-Improvement Precedes Sinister Stumbles
11:51 Humans In The Loop For The Very First Deception
12:32 The Hardware Stuff Is Going To Come After The Software Stuff
12:57 Distributing Your Training By Copy-Pasting Yourself Into Different Servers
13:42 Automating The Entire Hardware Pipeline
14:47 Having Text AGI Spit Out New Robotics Designs
16:33 The Case For Existential Risk From AI
18:32 Git Re-basin
18:54 Is Chain-Of-Thought Enough For Complex Reasoning In LMs?
19:52 Why Diffusion Models Outperform Other Generative Models
21:13 Using Whisper To Train GPT-4
22:33 Text-To-Video Was Only Slightly Impressive
23:29 The e=mc^2 of AGI

transcript: https://theinsideview.ai/ethan2
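For reference, here is the broken neural scaling law functional form discussed around 02:32, reproduced from Caballero et al.'s Broken Neural Scaling Laws paper as best I recall it; treat the exact parameterization as an assumption and check the paper for the authoritative statement.

```latex
% Broken Neural Scaling Law (BNSL): a smoothly broken power law whose n "breaks"
% connect power-law segments of different slopes on a log-log plot.
% x: the scale quantity (compute, parameters, data); y: the tracked metric.
% a: limit as x grows; b, c_0: initial segment; d_i: break locations;
% f_i: break sharpness; c_i: slope change at break i.
y = a + b x^{-c_0}
    \prod_{i=1}^{n} \left( 1 + \left( \frac{x}{d_i} \right)^{1/f_i} \right)^{-c_i f_i}
```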

[N] Last Week in AI News Digest - Automated chemical synthesis, using heartbeats to detect deepfakes, and more!
reddit
LLM Vibe Score0
Human Vibe Score1
regalalgorithmThis week

[N] Last Week in AI News Digest - Automated chemical synthesis, using heartbeats to detect deepfakes, and more!

Hi there, just sharing the latest edition of our AI news digest newsletter! We're just a couple of AI grad students doing this for fun, so we hope the self-promotion is not too annoying (also, feedback is welcome). See it below, and feel free to subscribe.

Mini Briefs

Robotics, AI, and Cloud Computing Combine to Supercharge Chemical and Drug Synthesis

IBM recently demoed a complex system for chemical testing and drug synthesis. The system has an AI component that predicts the results of chemical reactions, and a fully automated robotic experiment setup that runs chemical tests 24/7. Users can access the remote robotics lab online, and IBM can also install the system on-premise. With these tools working together, IBM is hoping to cut typical drug discovery and verification time in half.

AI researchers use heartbeat detection to identify deepfake videos

Researchers from multiple groups are tackling the challenge of detecting deepfake videos by analyzing the apparent heartbeat of the people depicted in the video. This is possible because a person's blood flow changes their skin color ever so slightly, and this change is often detectable via a process called photoplethysmography (PPG). Because deepfakes are not currently optimized to generate realistic heartbeats, temporal or spatial anomalies in PPG signals allow researchers to detect deepfakes with 97% accuracy (a toy sketch of the PPG idea follows at the end of this digest).

Advances & Business

This AI Expert From Senegal Is Helping Showcase Africans In STEM - Adji Bousso Dieng will be Princeton's School of Engineering's first Black female faculty member.

Google's AI-powered flood alerts now cover all of India and parts of Bangladesh - India, the world's second most populated nation, sees more than 20% of global flood-related fatalities each year as overrun riverbanks sweep tens of thousands of homes away. Two years ago, Google volunteered to help.

Finding magnetic eruptions in space with an AI assistant - MMS looks for explosive reconnection events as it flies through the magnetopause, the boundary region where Earth's magnetic field butts up against the solar wind that flows throughout the solar system.

This know-it-all AI learns by reading the entire web nonstop - Diffbot is building the biggest-ever knowledge graph by applying image recognition and natural-language processing to billions of web pages.

Bosch and Ford will test autonomous parking in Detroit - Ford, Bosch, and Dan Gilbert's real estate firm Bedrock today detailed an autonomous parking pilot scheduled to launch in September at The Assembly, a mixed-use building in Detroit's Corktown neighborhood.

Create your own moody quarantine music with Google's AI - Lo-Fi Player, the latest project out of Google Magenta, lets you mix tunes with the help of machine learning by interacting with a virtual room.

Apple launches AI/ML residency program to attract niche experts - As Apple's artificial intelligence and machine learning initiatives continue to expand, its interest in attracting talent has grown, a theme that's barely under the surface of the company's occasionally updated Machine Learning Research blog.

Dusty Robotics CEO Tessa Lau Discusses Robotics Start-Ups and Autonomous Robots for Construction - Tessa Lau is Founder/CEO at Dusty Robotics, whose mission is to increase construction industry productivity by introducing robotic automation on the jobsite.
Concerns & Hype

Google Offers to Help Others With the Tricky Ethics of AI - Companies pay cloud computing providers like Amazon, Microsoft, and Google big money to avoid operating their own digital infrastructure.

The Peace Dividends Of The Autonomous Vehicle Wars - The rapid growth of the mobile market in the late 2000s and early 2010s led to a burst of technological progress.

'Ethics must be part of the development process' - The increasing use of AI (artificial intelligence) in the development of new medical technologies demands greater attention to ethical aspects.

Analysis & Policy

China's new AI trade rules could hamper a TikTok sale - TikTok's attempt to sell itself and avert a possible US ban may run into some complications. The Wall Street Journal reports that China has unveiled new restrictions on AI technology exports that could affect TikTok.

Podcast

Check out our weekly podcast covering these stories! Website | RSS | iTunes | Spotify | YouTube
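To make the PPG idea above concrete, here is a toy sketch (not any group's published detector) that estimates a pulse frequency from a face crop. It assumes the video is already cropped to a face region as a (T, H, W, 3) RGB array and uses only numpy and scipy.

```python
# Toy photoplethysmography (PPG) sketch: estimate a pulse signal from subtle
# frame-to-frame color changes in facial skin. Assumes `frames` is a
# (T, H, W, 3) uint8 array of a face crop sampled at `fps` frames per second.
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_pulse_hz(frames: np.ndarray, fps: float) -> float:
    # Blood-volume changes show up most strongly in the green channel.
    green = frames[:, :, :, 1].astype(np.float64).mean(axis=(1, 2))
    green = green - green.mean()
    # Band-pass to the plausible human heart-rate range (~42-240 bpm).
    b, a = butter(3, [0.7, 4.0], btype="band", fs=fps)
    pulse = filtfilt(b, a, green)
    # The dominant frequency of the filtered signal approximates the heart rate.
    spectrum = np.abs(np.fft.rfft(pulse))
    freqs = np.fft.rfftfreq(len(pulse), d=1.0 / fps)
    return freqs[np.argmax(spectrum)]

# A deepfake detector built on this idea would look for missing or spatially
# inconsistent pulse signals across face regions, not just a single estimate.
```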

[N] AI Robotics startup Covariant (founded by Peter Chen, Pieter Abbeel, other Berkeley / ex-OpenAI folks) just raised $40M in Series B funding round. “Covariant has recently seen increased usage from clients hoping to avoid supply chain disruption due to the coronavirus pandemic.”
reddit
LLM Vibe Score0
Human Vibe Score0
baylearnThis week

[N] AI Robotics startup Covariant (founded by Peter Chen, Pieter Abbeel, other Berkeley / ex-OpenAI folks) just raised $40M in Series B funding round. “Covariant has recently seen increased usage from clients hoping to avoid supply chain disruption due to the coronavirus pandemic.”

h/t their announcement, VB and WSJ article:

Logistics AI Startup Covariant Reaps $40 Million in Funding Round

Company plans to explore uses of machine learning for automation beyond warehouse operations

Artificial-intelligence robotics startup Covariant raised $40 million to expand its logistics automation technology to new industries and ramp up hiring, the company said Wednesday.

The Berkeley, Calif.-based company makes AI software that it says helps warehouse robots pick objects at a faster rate than human workers, with a roughly 95% accuracy rate. Covariant is working with Austrian logistics-automation company Knapp AG and the robotics business of Swiss industrial conglomerate ABB Ltd., which provide hardware such as robot arms or conveyor belts to pair with the startup's technology platform.

"What we've built is a universal brain for robotic manipulation tasks," Covariant co-founder and Chief Executive Peter Chen said in an interview. "We provide the software, they provide the rest of the systems."

Logistics-sector appetite for such technology is growing as distribution and fulfillment operations that have relied on human labor look to speed output and meet rising digital-commerce demand. The coronavirus pandemic has accelerated that interest as businesses have sought to adjust their operations to volatile swings in consumer demand and to new restrictions, such as spacing workers further apart to guard against contagion.

That has provided a bright spot for some technology startups even as many big backers scale back venture-capital spending. Last month logistics delivery platform Bringg said it raised $30 million in a Series D funding round, for example, as demand for home delivery of food, household goods and e-commerce staples soared among homebound consumers.

Covariant's Series B round brings the company's total funding to $67 million. New investor Index Ventures led the round, with participation from existing investor Amplify Partners and new investors including Radical Ventures. Mr. Chen said the funding will be used to explore the technology's potential application in other markets such as manufacturing, recycling or agriculture "where there are repetitive manual processes." Covariant also plans to hire more engineering and other staff, he said.

Covariant was founded in 2017 and now has about 50 employees. The company's technology uses camera systems to capture images of objects, and artificial intelligence to analyze the objects and how to pick them up. Machine learning helps Covariant-powered robots learn from experience. The startup's customers include a German electrical-supplies wholesaler that uses the technology to control a mechanical arm that picks out orders of circuit boards, switches and other goods.

DARPA "AI For Critical Minerals Assessment" Competition [D]
reddit
LLM Vibe Score0
Human Vibe Score0
Scherzers_Brown_EyeThis week

DARPA "AI For Critical Minerals Assessment" Competition [D]

DARPA is hosting a competition called "AI for Critical Mineral Assessments," which is looking for solutions that automatically extract and georeference features from scanned or raster maps. The U.S. Geological Survey uses data from these assessments to build reports that can eventually lead to increased domestic production of critical minerals and reduced U.S. reliance on imports. The competition includes two independent challenges:

Map Georeferencing Challenge: Automated map georeferencing is a difficult task, as most USGS maps are not digitized and may use any of a multitude of historical coordinate projection systems. Furthermore, the quality of features on scanned maps, critical for identifying control points for alignment, can vary greatly. Participants will receive a dataset of 1,000 or more maps of various types for training and validation. The goal of this challenge is to accurately geolocate a map of unknown location and coordinate system by fitting coordinate points that can be referenced to known locations in one or more base maps (a sketch of this fitting step follows below). Register now through Aug. 26.

Map Feature Extraction Challenge: Automated map feature extraction is a difficult task because map features (polygons, points, lines, text) often overlap and are sometimes discontinuous. Not only do the features come in all shapes and sizes, but the same feature type can be depicted on different maps using different symbols or patterns. This makes it challenging to create a universal identifier for even a single feature such as a mine location or mineral resource tract. Participants will be provided a training set consisting of maps with each legend item labeled and characterized (as point, line, or polygon) and a binary pixel map reflecting the feature's coverage in the map. The goal of the challenge is to identify all features in a map that appear in the map's legend. Register Sept. 5 - 16.

For each of the two challenges, DARPA will award:
· $10,000 for first prize
· $3,000 for second prize
· $1,000 for third prize

You can visit criticalminerals.darpa.mil for complete details on how you can compete.
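The georeferencing challenge ultimately comes down to fitting a transform from pixel coordinates to ground coordinates using matched control points. Purely as a toy illustration (the control points below are made up, and real entries must also discover the control points and handle many historical projections), here is the least-squares fitting step:

```python
# Sketch of the core operation in map georeferencing: given control points
# matched between pixel space and a known coordinate system, fit an affine
# transform by least squares. This shows only the fitting step.
import numpy as np

# Hypothetical matched points: (col, row) pixels -> (lon, lat) ground coords.
pixels = np.array([[100, 200], [1500, 180], [120, 1900], [1480, 1850]], float)
ground = np.array([[-105.1, 40.0], [-104.6, 40.0], [-105.1, 39.5], [-104.6, 39.5]])

# Affine model: [lon, lat] = [col, row, 1] @ A; solve for the 3x2 matrix A.
design = np.hstack([pixels, np.ones((len(pixels), 1))])
A, residuals, _, _ = np.linalg.lstsq(design, ground, rcond=None)

def pixel_to_ground(col: float, row: float) -> np.ndarray:
    return np.array([col, row, 1.0]) @ A

print(pixel_to_ground(800, 1000))  # approximate ground coordinates
```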

[P]MMML | Deploy HuggingFace training model rapidly based on MetaSpore
reddit
LLM Vibe Score0
Human Vibe Score1
qazmkoppThis week

[P]MMML | Deploy HuggingFace training model rapidly based on MetaSpore

A few days ago, HuggingFace announced a $100 million Series C funding round, which was big news in open-source machine learning and could be a sign of where the industry is headed. Two days before the HuggingFace funding announcement, the open-source machine learning platform MetaSpore released a demo of rapid deployment of HuggingFace pre-trained models.

As deep learning makes innovative breakthroughs in computer vision, natural language processing, speech understanding, and other fields, more and more unstructured data is perceived, understood, and processed by machines. These advances are mainly due to the powerful learning ability of deep models: through pre-training on massive data, models capture internal data patterns that help many downstream tasks. As industry and academia invest more and more energy in pre-training research, distribution hubs for pre-trained models such as HuggingFace and Timm have emerged one after another, and the open-source community is releasing pre-trained model dividends at an unprecedented speed.

In recent years, the data that machines model and understand has gradually evolved from a single modality to multiple modalities, and the semantic gap between different modalities is being closed, making cross-modal retrieval possible. Take CLIP, OpenAI's open-source work, as an example: it pre-trains twin towers for images and texts on a dataset of 400 million image-text pairs and connects the semantics of the two modalities. Many researchers in academia have since tackled multimodal problems such as image generation and retrieval on top of this technology. But even though frontier techniques bridge the semantic gap between modal data, putting them into production still involves heavy and complicated model tuning, offline data processing, high-performance online inference architecture design, heterogeneous computing, and online algorithm deployment, which hinders cutting-edge multimodal retrieval technology from reaching the ground broadly and inclusively.

DMetaSoul targets these technical pain points by abstracting and unifying steps such as model training and optimization, online inference, and algorithm experimentation into a set of solutions that can quickly move an offline pre-trained model online. This post introduces how to run online inference and algorithm experiments with HuggingFace community pre-trained models on the MetaSpore technology ecosystem, so that the benefits of pre-trained models can be fully released to specific businesses, industries, and small and medium-sized enterprises. We give two multimodal retrieval demonstrations for reference: text-to-text search and text-to-image search.

Multimodal semantic retrieval

The sample architecture of multimodal retrieval is as follows. Our multimodal retrieval system supports both text-to-text and text-to-image application scenarios and includes offline processing, model inference, online services, and other core modules:

https://preview.redd.it/w4v4c7vcez291.png?width=1834&format=png&auto=webp&s=0687efb1fddb26e8e30cb844d398ec712b947f31

Offline processing includes the offline data processing flows for the text-to-text and text-to-image scenarios: model tuning, model export, index database construction, data push, and so on.

Model inference.
After offline model training, we deploy our NLP and CV large models on the MetaSpore Serving framework. MetaSpore Serving helps us conveniently perform online inference, elastic scheduling, load balancing, and resource scheduling in heterogeneous environments.

Online services. Based on MetaSpore's online algorithm application framework, MetaSpore provides a complete set of reusable online search services, including a front-end retrieval UI, multimodal data preprocessing, vector recall and ranking algorithms, an A/B experiment framework, and more. MetaSpore supports text-to-text and text-to-image search out of the box and can be migrated to other application scenarios at low cost.

The HuggingFace open-source community provides several excellent baseline models for similar multimodal retrieval problems, and these are often the starting point for real-world optimization in industry. MetaSpore also uses HuggingFace community pre-trained models in its online text-to-text and text-to-image services: text-to-text search is based on a question-answering semantic similarity model optimized by MetaSpore, and text-to-image search is based on a community pre-trained model. These open-source pre-trained models are exported to the general ONNX format and loaded into MetaSpore Serving for online inference. The following sections describe the model export and the online retrieval algorithm services in detail. The model inference part is a standardized SaaS service with low coupling to the business. Interested readers can refer to my previous post: The design concept of MetaSpore, a new generation of the one-stop machine learning platform.

1.1 Offline Processing

Offline processing mainly involves exporting and loading the online models and building and pushing the index of the document library. You can follow the step-by-step instructions below to complete the offline processing for text-to-text and text-to-image search and see how an offline pre-trained model serves inference in MetaSpore.

1.1.1 Search text by text

Traditional text retrieval systems are based on literal matching algorithms such as BM25. Because users' query words are diverse, a semantic gap between query words and documents is often encountered. For example, a user misspells "iPhone" as "Phone," or a search term is extremely long, such as "1-3 months old baby autumn small-size bag pants." Traditional text retrieval systems use spelling correction, synonym expansion, query rewriting, and other means to alleviate the semantic gap, but they fundamentally fail to solve the problem. Only when the retrieval system fully understands users' query terms and documents can it meet users' retrieval demands at the semantic level. With the continuous progress of pre-training and representation learning technology, some commercial search engines are integrating semantic vector retrieval methods into the retrieval stack alongside symbolic learning.

Semantic retrieval model

This section introduces a semantic vector retrieval application. MetaSpore built a semantic retrieval system based on encyclopedia question-and-answer data, adopting the Sentence-BERT model as the semantic vector representation model. Sentence-BERT fine-tunes twin-tower BERT encoders in supervised or unsupervised ways to make the model more suitable for retrieval tasks.
The model structure is as follows: a query-doc symmetric two-tower model is used for text search and question-and-answer retrieval. The online query and the offline documents share the same vector representation model, so it is necessary to ensure consistency between the offline index-building model and the online query inference model. This case uses MetaSpore's text representation model sbert-chinese-qmc-domain-v1, optimized on an open-source semantic similarity dataset. The model encodes the question-and-answer data as vectors during offline index construction and encodes the user query as a vector during online retrieval; because query and documents live in the same semantic space, users' semantic retrieval demands can be served by a vector similarity calculation.

Since the text representation model encodes queries online, we need to export the model for use by the online service. Go to the Q&A data library code directory and export the model following the documentation. The export script uses PyTorch tracing, and the models are exported to the "./export" directory. The exported artifacts are mainly the ONNX model used for online inference, the tokenizer, and related configuration files; the online serving system described below loads them into MetaSpore Serving for model inference. Since the exported model will be copied to cloud storage, you need to configure the related variables in env.sh.

Build the library for text-to-text search

The retrieval database is built on a million-scale encyclopedia question-and-answer dataset. Following the description document, download the data and complete the database construction. The question-and-answer data is encoded as vectors by the offline model, and the index data is then pushed to the service components. The whole database construction process is as follows:

Preprocessing: convert the original data into a more general JSONLines format for index building.
Build the index: use the same model as online, sbert-chinese-qmc-domain-v1, to index the documents (one document object per line).
Push the inverted (vector) and forward (document field) data to the respective component servers.

The documentation includes an example of the index data format. After offline database construction is completed, the data is pushed to the corresponding service components, such as Milvus, which stores the vector representations of documents, and MongoDB, which stores the summary information of documents. The online retrieval algorithm services use these components to obtain the relevant data.
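To make the symmetric two-tower step concrete, here is a minimal sketch using the sentence-transformers library. The checkpoint name follows the post's sbert-chinese-qmc-domain-v1 under the DMetaSoul namespace, which is an assumption on my part; substitute whatever Sentence-BERT checkpoint you actually deploy.

```python
# Minimal sketch of symmetric two-tower semantic retrieval.
from sentence_transformers import SentenceTransformer, util

# Assumed checkpoint name; any Sentence-BERT model works the same way.
model = SentenceTransformer("DMetaSoul/sbert-chinese-qmc-domain-v1")

# Offline: encode the document side once and store the vectors (e.g. in Milvus).
docs = ["How do I renew my ID card?", "What documents does a passport need?"]
doc_vecs = model.encode(docs, normalize_embeddings=True)

# Online: the query is encoded with the *same* model, so query and documents
# live in one semantic space and similarity is a simple dot product.
query_vec = model.encode("ID card renewal process", normalize_embeddings=True)
scores = util.dot_score(query_vec, doc_vecs)
print(scores)  # higher score = semantically closer document
```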
1.1.2 Search image by text

Text and images are easy for humans to relate semantically but difficult for machines. First, from the perspective of data form, text is one-dimensional, discrete, ID-based data built from words, while images are continuous two- or three-dimensional data. Second, text is a subjective creation of human beings, with a vibrant expressive range that includes twists, metaphors, and other figures of speech, while images are machine representations of the objective world. In short, bridging the semantic gap between text and image data is much more complex than text-to-text search.

Traditional text-to-image retrieval generally relies on external text descriptions attached to the image, or on nearest-neighbor retrieval over the image's associated text, which in essence degrades the problem to text search. It also faces many issues, such as how to obtain the associated text for pictures and whether text search over that text is accurate enough. In recent years, deep models have gradually evolved from single-modality to multi-modality. Taking OpenAI's open-source project CLIP as an example, the model is trained on massive image-text data from the internet and maps text and image data into the same semantic space, making semantic-vector-based text-to-image search technology possible.

CLIP image-text model

The text-to-image search introduced here is implemented with semantic vector retrieval, using the CLIP pre-trained model in a two-tower retrieval architecture. Because CLIP aligned the semantics of the text-side and image-side towers on massive image-text data, it is particularly suitable for the text-to-image scenario. Since image and text data take different forms, a query-doc asymmetric two-tower model is used for text-to-image retrieval: the image-side tower is used for offline database construction, and the text-side tower is used for online query encoding. In online retrieval, the text-side model encodes the query, the resulting vector is searched against the image-side database, and the CLIP pre-training guarantees the semantic correlation between images and texts, since pre-training on a large amount of visual-text data draws matching image-text pairs closer in vector space.

Here we need to export the text-side model for online MetaSpore Serving inference. Since the retrieval scenario is in Chinese, a CLIP model supporting Chinese understanding is selected. The exported content includes the ONNX model used for online inference and the tokenizer, similar to the text-to-text case, and MetaSpore Serving loads the exported artifacts for model inference.

Build the library for image search

You need to download the Unsplash Lite library data and complete the construction according to the instructions. The whole database construction process is as follows:

Preprocessing: specify the image directory, then generate a more general JSONLines file for index building.
Build the index: use the openai/clip-vit-base-patch32 pre-trained model to index the gallery, outputting one document object per line of index data.
Push the inverted (vector) and forward (document field) data to the respective component servers.

As with text-to-text search, after offline database construction the relevant data is pushed to the service components, which the online retrieval algorithm services call to obtain the relevant data.
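As a minimal sketch of the asymmetric CLIP two-tower flow, here is the openai/clip-vit-base-patch32 checkpoint named above used via the HuggingFace transformers API. This is an illustration, not MetaSpore's actual serving path; a Chinese-capable text tower would be swapped in for the Chinese search scenario.

```python
# Asymmetric CLIP two-tower sketch: the image tower indexes the gallery
# offline, the text tower encodes queries online.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Offline: embed gallery images (normally pushed to a vector store like Milvus).
image = Image.open("cat.jpg")  # hypothetical gallery image
image_inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    image_vec = model.get_image_features(**image_inputs)
image_vec = image_vec / image_vec.norm(dim=-1, keepdim=True)

# Online: embed the text query with the text tower, rank by cosine similarity.
text_inputs = processor(text=["a black cat on the bed"],
                        return_tensors="pt", padding=True)
with torch.no_grad():
    text_vec = model.get_text_features(**text_inputs)
text_vec = text_vec / text_vec.norm(dim=-1, keepdim=True)

print((text_vec @ image_vec.T).item())  # cosine similarity after normalization
```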
1.2 Online Services

The overall online service architecture diagram is as follows:

https://preview.redd.it/jfsl8hdfez291.png?width=1280&format=png&auto=webp&s=a858e2304a0c93e78ba5429612ca08cbee69b35a

The multimodal search online service system supports both text-to-text and text-to-image application scenarios and consists of the following parts:

Query preprocessing service: encapsulates the preprocessing logic of the pre-trained models (for text, images, etc.) and provides it through a gRPC interface.
Retrieval algorithm service: the whole algorithm processing chain, including A/B experiment traffic splitting, MetaSpore Serving calls, vector recall, ranking, document summaries, and so on.
User entry service: provides a web UI for users to debug and trace problems in the retrieval service.

From the perspective of a user request, these services form invocation dependencies from back to front, so to stand up the full multimodal sample you need to bring the services up in dependency order. Before doing this, remember to export the offline models, put them online, and build the library first. This article introduces each part of the online service system and walks through bringing the whole system up step by step; see the README at the end of this article for more details.

1.2.1 Query preprocessing service

Deep learning models operate on tensors, but NLP and CV models usually have a preprocessing step that translates raw text and images into the tensors the models accept. For example, NLP models typically have a tokenizer that transforms string-typed text data into discrete tensor data, and CV models have similar logic for cropping, scaling, and otherwise transforming input images. On the one hand, this preprocessing logic is decoupled from the tensor inference of the deep model; on the other hand, deep-model inference has an independent, ONNX-based technical stack. MetaSpore therefore split the preprocessing logic out. The NLP tokenizer has been integrated into the Query preprocessing service under a fairly general convention: users only need to provide a preprocessing logic file implementing the load and predict interface and export the necessary data and configuration files, which are loaded into the preprocessing service. CV preprocessing logic will be integrated in the same manner later.

The preprocessing service currently exposes a gRPC interface and is depended on by the Query preprocessing (QP) module in the retrieval algorithm service: after a user request reaches the retrieval algorithm service, it is forwarded to this service to complete the data preprocessing before subsequent processing continues. The README provides details on how the preprocessing service is started, how a preprocessing model exported offline to cloud storage enters the service, and how to debug the service.

To further improve the efficiency and stability of model inference, MetaSpore Serving implements a Python preprocessing submodule. MetaSpore can thus provide gRPC services through a user-specified preprocessor.py, complete the tokenizer or CV-related preprocessing, and translate requests into tensors that the deep models can handle, after which the model inference is carried out by subsequent MetaSpore Serving submodules. The code is presented here: https://github.com/meta-soul/MetaSpore/compare/add_python_preprocessor
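The load/predict convention above can be illustrated with a small, hypothetical preprocessor.py. The class shape and tensor names below are my own illustration, not MetaSpore Serving's actual interface; see its documentation for the real contract.

```python
# Hypothetical preprocessor in the spirit of the load/predict convention:
# the service loads a user-supplied module that turns raw text into the
# named tensors an ONNX model expects.
import numpy as np
from transformers import AutoTokenizer

class Preprocessor:
    def load(self, export_dir: str) -> None:
        # The tokenizer files are part of the offline model export.
        self.tokenizer = AutoTokenizer.from_pretrained(export_dir)

    def predict(self, texts: list[str]) -> dict[str, np.ndarray]:
        # Convert raw strings into the named input tensors of the ONNX graph.
        enc = self.tokenizer(texts, padding=True, truncation=True,
                             return_tensors="np")
        return {"input_ids": enc["input_ids"],
                "attention_mask": enc["attention_mask"]}
```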
1.2.2 Retrieval algorithm service

The retrieval algorithm service is the core of the whole online service system. It is responsible for experiment traffic splitting, the assembly of algorithm chains such as preprocessing, recall, and ranking, and the invocation of dependent component services. The service is developed on the Java Spring framework and supports the text-to-text and text-to-image multimodal retrieval scenarios. Thanks to good internal abstraction and modular design, it is highly flexible and can be migrated to similar application scenarios at low cost.

Here's a quick guide to configuring the environment and setting up the retrieval algorithm service; see the README for more details:

Install dependent components: use Maven to install the online-serving component.
Configure the search service: copy the template configuration file and replace the MongoDB, Milvus, and other configurations based on the development/production environment.
Install and configure Consul: Consul lets you synchronize the search service configuration in real time, including experiment traffic splits, recall parameters, and ranking parameters. The project's configuration file shows the current configuration parameters of text-to-text and text-to-image search. The modelName parameter in the preprocessing and recall stages corresponds to the model exported in offline processing.
Start the service: once the above configuration is complete, the retrieval service can be started from the entry script.

Once the service is started, you can test it! For example, for a user with userId=10 who wants to query "How to renew an ID card," access the text search service.

1.2.3 User entry service

Because the retrieval algorithm service is exposed only as an API, it is difficult to locate and trace problems through it alone; for the text-to-image scenario in particular, intuitively displaying the retrieval results facilitates iterative optimization of the retrieval algorithm. This article therefore provides a lightweight web UI for text-to-text and text-to-image search: a search input box and a results display page. Developed with Flask, the service can easily be integrated with other retrieval applications; it calls the retrieval algorithm service and displays the returned results on the page. The service is also easy to install and start. Once you're done, go to http://127.0.0.1:8090 to see whether the search UI service is working correctly. See the README at the end of this article for details.

Multimodal system demonstration

The multimodal retrieval service can be started once offline processing and the online service environment configuration have been completed following the instructions above. Examples of text-to-image searches are shown below.

Enter the text-to-image search application and type "cat"; the top three returned results are cats:

https://preview.redd.it/0n5nuyvhez291.png?width=1280&format=png&auto=webp&s=1e9c054f541d53381674b8d6001b4bf524506bd2

Add a color constraint and retrieve "black cat"; the system does return a black cat:

https://preview.redd.it/rzc0qjyjez291.png?width=1280&format=png&auto=webp&s=d5bcc503ef0fb3360c7740e60e295cf372dcad47

Strengthen the constraint on the search term further, change it to "black cat on the bed," and the results contain pictures of a black cat climbing on a bed:

https://preview.redd.it/c4b2q8olez291.png?width=1280&format=png&auto=webp&s=4f3817b0b9f07e1e68d1d4a8281702ba3834a00a

The cat can still be found through the text search system after the color and scene modifications in the examples above.
Conclusion

Cutting-edge pre-training technology can bridge the semantic gap between modalities, and the HuggingFace community greatly reduces the cost for developers to use pre-trained models. Combined with the online inference and online microservice technology ecosystem that DMetaSoul provides through MetaSpore, pre-trained models are no longer mere offline dabbling: they can truly achieve end-to-end implementation from cutting-edge technology to industrial scenarios, fully releasing the dividends of large pre-trained models.

In the future, DMetaSoul will continue to improve and optimize the MetaSpore technology ecosystem:

More automated and wider access to the HuggingFace community ecosystem. MetaSpore will soon release a general model rollout mechanism to make the HuggingFace ecosystem accessible and will later integrate the preprocessing services into the online services.
Multimodal retrieval offline algorithm optimization. For multimodal retrieval scenarios, MetaSpore will continuously iterate on the offline algorithm components, including the text recall/ranking model and the image recall/ranking model, to improve the accuracy and efficiency of the retrieval algorithms.

For related code and reference documentation in this article, please visit: https://github.com/meta-soul/MetaSpore/tree/main/demo/multimodal/online

Some image sources: https://github.com/openai/CLIP/raw/main/CLIP.png https://www.sbert.net/examples/training/sts/README.html

[D] Last Week in Medical AI: Top Research Papers/Models 🏅(September 21 - September 27, 2024)
reddit
LLM Vibe Score0
Human Vibe Score1
aadityauraThis week

[D] Last Week in Medical AI: Top Research Papers/Models 🏅(September 21 - September 27, 2024)

Last Week in Medical AI: Top Research Papers/Models 🏅 (September 21 - September 27, 2024)

Medical AI Paper of the Week

A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? This paper presents o1, a Large Language Model (LLM) evaluated across 37 medical datasets, demonstrating superior performance in clinical understanding, reasoning, and multilinguality compared to GPT-4 and GPT-3.5.

Medical LLM & Other Models

DREAMS: Python Framework for Medical LLMs - A comprehensive deep learning framework for EEG data processing, model training, and report generation.
SLaVA-CXR: A Small Language and Vision Assistant for Chest X-Ray Report Automation - Introduces SLaVA-CXR, an innovative small-scale model designed for automating chest X-ray reports with high accuracy and efficiency.
O1 in Medicine: AI Doctor Potential
Genome Language Model: Opportunities & Challenges - Highlights key gLM applications like functional constraint prediction, sequence design, and transfer learning, while discussing the challenges of developing effective gLMs for complex genomes.

Medical LLMs & Benchmarks

MEDICONFUSION: Probing Medical LLM Reliability - Introduces MediConfusion, a challenging benchmark for probing the failure modes of multimodal large language models (MLLMs) in medical imaging.
CHBench: Chinese LLM Health Evaluation - Introduces CHBench, the first comprehensive Chinese health-related benchmark designed to evaluate large language models (LLMs) on their understanding of physical and mental health.
LLMs for Mental Illness Evaluation
PALLM: Evaluating Palliative Care LLMs
Protein LMs: Scaling Necessity?

Frameworks and Methodologies

Digital Twin for Oncology Operations
Enhancing Guardrails for Healthcare AI
InterMind: LLM-Powered Depression Assessment
Conversational Health Agents: LLM Framework

Medical LLM Applications

LLMs for Mental Health Severity Prediction
Fine-tuning LLMs for Radiology Reports
LLMs in Patient Education: Back Pain
Boosting Healthcare LLMs with Retrieved Context
Continuous Pretraining for Clinical LLMs

AI in Healthcare Ethics

Confidence Intervals in Medical Imaging AI
Generative AI Readiness for Clinical Use

Check the full thread in detail: https://x.com/OpenlifesciAI/status/1840020394880667937

Thank you for reading! If you know of any interesting papers that were missed, feel free to share them in the comments. If you have insights or breakthroughs in Medical AI you'd like to share in next week's edition, connect with us on Twt/x: OpenlifesciAI

[R] OS-Copilot: Towards Generalist Computer Agents with Self-Improvement - Shanghai AI Laboratory 2024
reddit
LLM Vibe Score0
Human Vibe Score1
Singularian2501This week

[R] OS-Copilot: Towards Generalist Computer Agents with Self-Improvement - Shanghai AI Laboratory 2024

Paper: https://arxiv.org/abs/2402.07456
Github: https://github.com/OS-Copilot/FRIDAY

Abstract: Autonomous interaction with the computer has been a longstanding challenge with great potential, and the recent proliferation of large language models (LLMs) has markedly accelerated progress in building digital agents. However, most of these agents are designed to interact with a narrow domain, such as a specific piece of software or a single website. This narrow focus constrains their applicability for general computer tasks. To this end, we introduce OS-Copilot, a framework to build generalist agents capable of interfacing with comprehensive elements in an operating system (OS), including the web, code terminals, files, multimedia, and various third-party applications. We use OS-Copilot to create FRIDAY, a self-improving embodied agent for automating general computer tasks. On GAIA, a general AI assistants benchmark, FRIDAY outperforms previous methods by 35%, showcasing strong generalization to unseen applications via accumulated skills from previous tasks. We also present numerical and quantitative evidence that FRIDAY learns to control and self-improve on Excel and PowerPoint with minimal supervision. Our OS-Copilot framework and empirical findings provide infrastructure and insights for future research toward more capable and general-purpose computer agents.

https://preview.redd.it/uzec8udohdic1.jpg?width=1655&format=pjpg&auto=webp&s=893b5561ca47c26c789b69925efdc26e5b783007
https://preview.redd.it/vfwfwudohdic1.jpg?width=1653&format=pjpg&auto=webp&s=9eafc2a5ea0ad188a156d3de446508d82d9cc913
https://preview.redd.it/lmi8rwdohdic1.jpg?width=1123&format=pjpg&auto=webp&s=dbc67b27585b980d0c592f9bd9f87f3ec6531f66
https://preview.redd.it/20yo21eohdic1.jpg?width=1037&format=pjpg&auto=webp&s=72fab36d585b862eed4ff6c7deed2be0cd62f637

[D] Last Week in Medical AI: Top LLM Research Papers/Models (December 7 - December 14, 2024)
reddit
LLM Vibe Score0
Human Vibe Score0
aadityauraThis week

[D] Last Week in Medical AI: Top LLM Research Papers/Models (December 7 - December 14, 2024)

[D] Last Week in Medical AI: Top LLM Research Papers/Models (December 7 - December 14, 2024)

https://preview.redd.it/o23fp3csj07e1.jpg?width=1280&format=pjpg&auto=webp&s=69e19fc351b3aa5e34c4c00e66245583f88bd9bb

Medical LLM & Other Models

PediaBench: Chinese Pediatric LLM - Introduces PediaBench, the first Chinese pediatric dataset for evaluating Large Language Model (LLM) question-answering performance, containing 4,565 objective and 1,632 subjective questions across 12 disease groups.
BiMediX: Bilingual Medical LLM - Introduces BiMediX, the first bilingual (English-Arabic) medical Mixture of Experts LLM, along with BiMed1.3M, a 1.3M bilingual medical instruction dataset with over 632M tokens used for training.
Diverse medical knowledge integration - Introduces BiMediX2, a bilingual (Arabic-English) Large Multimodal Model (LMM) based on the Llama 3.1 architecture, trained on 1.6M medical interaction samples.
BRAD: Digital Biology Language Model - Introduces BRAD (Bioinformatics Retrieval Augmented Digital assistant), an LLM-powered chatbot and agent system integrating various bioinformatics tools.
MMedPO: Vision-Language Medical LLM - Introduces MMedPO, a multimodal medical preference optimization approach to improve factual accuracy in Medical Large Vision-Language Models (Med-LVLMs) by addressing modality misalignment.

Frameworks & Methodologies

- TOP-Training: Medical Q&A Framework
- Hybrid RAG: Secure Medical Data Management
- Zero-Shot ATC Clinical Coding
- Chest X-Ray Diagnosis Architecture
- Medical Imaging AI Democratization

Benchmarks & Evaluations

- KorMedMCQA: Korean Healthcare Licensing Benchmark
- Large Language Model Medical Tasks
- Clinical T5 Model Performance Study
- Radiology Report Quality Assessment
- Genomic Analysis Benchmarking

LLM Applications

- TCM-FTP: Herbal Prescription Prediction
- LLaSA: Activity Analysis via Sensors
- Emergency Department Visit Predictions
- Neurodegenerative Disease AI Diagnosis
- Kidney Disease Explainable AI Model

Ethical AI & Privacy

- Privacy-Preserving LLM Mechanisms
- AI-Driven Digital Organism Modeling
- Biomedical Research Automation
- Multimodality in Medical Practice

Full thread in detail: https://x.com/OpenlifesciAI/status/1867999825721242101

[R] TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs - Yaobo Liang et al Microsoft 2023
reddit
LLM Vibe Score0
Human Vibe Score1
Singularian2501This week

[R] TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs - Yaobo Liang et al Microsoft 2023

Paper: https://arxiv.org/abs/2303.16434

Abstract: Artificial Intelligence (AI) has made incredible progress recently. On the one hand, advanced foundation models like ChatGPT can offer powerful conversation, in-context learning, and code-generation abilities on a broad range of open-domain tasks, and they can generate high-level solution outlines for domain-specific tasks based on the common-sense knowledge they have acquired. However, they still face difficulties with some specialized tasks because they lack enough domain-specific data during pre-training, or they often make errors in their neural network computations on tasks that require accurate execution. On the other hand, there are also many existing models and systems (symbolic-based or neural-based) that can do some domain-specific tasks very well. However, due to different implementations or working mechanisms, they are not easily accessible or compatible with foundation models. Therefore, there is a clear and pressing need for a mechanism that can leverage foundation models to propose task solution outlines and then automatically match some of the sub-tasks in the outlines to off-the-shelf models and systems with special functionalities to complete them. Inspired by this, we introduce TaskMatrix.AI as a new AI ecosystem that connects foundation models with millions of APIs for task completion. Unlike most previous work that aimed to improve a single AI model, TaskMatrix.AI focuses more on using existing foundation models (as a brain-like central system) and the APIs of other AI models and systems (as sub-task solvers) to achieve diversified tasks in both digital and physical domains. As a position paper, we present our vision of how to build such an ecosystem, explain each key component, and use case studies to illustrate both the feasibility of this vision and the main challenges we need to address next.

https://preview.redd.it/0guexiznhxqa1.jpg?width=979&format=pjpg&auto=webp&s=e5d818ae789cfc493cfb82fdf8b002a8dfe11939
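As a toy illustration of the outline-then-dispatch idea at the center of the abstract, here is a minimal sketch with a stub in place of the foundation model. The registry, matching rule, and "LLM" below are stand-ins of my own, not TaskMatrix.AI's actual components or interface.

```python
# Toy version of the paper's central loop: a foundation model proposes a
# solution outline, and sub-tasks are matched to registered APIs for execution.
from typing import Callable

API_REGISTRY: dict[str, Callable[[str], str]] = {
    "weather.lookup": lambda arg: f"(weather report for {arg})",
    "calendar.add_event": lambda arg: f"(event '{arg}' added)",
}

def fake_llm_outline(task: str) -> list[tuple[str, str]]:
    # A real system would prompt an LLM to emit (api_name, argument) steps,
    # grounded in documentation of the available APIs.
    return [("weather.lookup", "Berlin"), ("calendar.add_event", "picnic if sunny")]

def run(task: str) -> list[str]:
    results = []
    for api_name, arg in fake_llm_outline(task):
        results.append(API_REGISTRY[api_name](arg))  # dispatch sub-task to API
    return results

print(run("plan a picnic in Berlin"))
```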

🌟 Introducing DarwinAI: An Open-Source Platform for the Evolution of Intelligent Agents 🚀 [Project]
reddit
LLM Vibe Score0
Human Vibe Score1
Interesting-Fox-6758This week

🌟 Introducing DarwinAI: An Open-Source Platform for the Evolution of Intelligent Agents 🚀 [Project]

🌱 The Vision: Evolutionary AI at Your Fingertips

Imagine a world where AI agents aren't just programmed to perform tasks but evolve over time, adapting and improving through generations, much like living organisms. Welcome to DarwinAI, an open-source platform inspired by biological evolution, designed to breed, train, and evolve AI agents that can tackle complex, dynamic, and unpredictable challenges.

🧬 The Genetic Blueprint: Building Blocks of Intelligence

At the core of DarwinAI is the concept of a digital DNA for each AI agent. This DNA is a modular structure that defines the agent's capabilities, behaviors, and adaptability. Here's what makes up this digital DNA:

Genes of Ability: snippets of code that represent specific functions, like data classification, text analysis, or optimization. Think of them as the skills your AI agent possesses.
Genes of Adaptation: genes that control how the agent responds to different environments or contexts. They determine its flexibility and resilience in the face of changing conditions.
Genes of Connection: genes that define how the agent interacts with other agents or external resources. They are the social and collaborative aspects of the agent.

This digital DNA is stored in a structured, version-controlled database, allowing us to track the evolution of each agent and ensure that beneficial mutations are preserved over time.

🛠️ The Evolutionary Process: From Genesis to Mastery

The evolution of AI agents in DarwinAI happens through a series of generations, each building upon the strengths of the previous one (a minimal sketch of this loop follows at the end of this section):

Selection of Parents: the fittest agents, those that excel at specific tasks, are chosen as parents. These agents have proven their worth in the simulated environment and are prime candidates for breeding the next generation.
Genetic Crossover: the digital DNA of the parent agents is combined to create new agents. This can happen in two ways: direct crossover, where entire genes are copied from the parents, and combinatorial crossover, where parts of different genes are fused to create entirely new abilities.
Mutations: random, small changes are introduced into the genes to promote diversity and explore new solutions. These mutations are the wildcards that can lead to breakthrough abilities.

🌍 The Simulated Environment: A Playground for Evolution

Agents don't just exist in a vacuum; they operate in a dynamic, simulated environment where they must adapt and survive. This environment challenges the agents with:

Evolutionary Tasks: problems that agents must solve, such as data classification, prediction, or content generation.
Changing Contexts: factors like noisy data, resource constraints, or new rules that force agents to adapt on the fly.

🐣 The Life Cycle of an Agent: From Birth to Legacy

Each agent goes through a life cycle that mirrors natural selection:

Initial Learning: agents receive initial training based on their digital DNA.
Task Execution: they perform tasks in the simulated environment, where their abilities are put to the test.
Performance Evaluation: their effectiveness, adaptability, and efficiency are measured.
Reproduction: the top-performing agents produce offspring with improved genetic traits.
Discard and Archive: less effective agents are archived for future analysis, ensuring that their lessons are not lost.
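None of the following is DarwinAI's actual code, but as a minimal sketch of the selection, crossover, and mutation loop described above, with an agent's digital DNA reduced to a list of gene names and a stand-in fitness function:

```python
# Minimal genetic-algorithm loop: select parents, cross over DNA, mutate.
import random

GENE_POOL = ["classify", "summarize", "optimize", "translate", "plan"]

def fitness(dna: list[str]) -> float:
    # Stand-in for measured task performance in the simulated environment.
    return len(set(dna)) + random.random()

def crossover(a: list[str], b: list[str]) -> list[str]:
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]  # direct crossover: copy whole genes from parents

def mutate(dna: list[str], rate: float = 0.1) -> list[str]:
    # Small random gene swaps promote diversity and explore new solutions.
    return [random.choice(GENE_POOL) if random.random() < rate else g for g in dna]

population = [[random.choice(GENE_POOL) for _ in range(4)] for _ in range(20)]
for generation in range(10):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]  # selection of the fittest
    offspring = [mutate(crossover(*random.sample(parents, 2))) for _ in range(15)]
    population = parents + offspring  # survivors plus the new generation
print(population[0])
```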
🧩 Knowledge Transfer: Passing the Torch

One of the key aspects of DarwinAI is the ability for agents to pass on their learned knowledge to future generations:

Weight Persistence: trained models retain their learned weights, allowing them to inherit capabilities from their ancestors.
Modular Transfer: optimized ability genes can be directly copied to new generations, ensuring that valuable skills are preserved.

🛠️ Modularity and Extensibility: Build, Mix, and Evolve

DarwinAI is designed to be highly modular and extensible, allowing for:

New Capabilities: easily incorporate new genes to expand the agents' abilities over time.
Hybridization: combine agents from different specializations to create more complex and versatile agents.
Directed Evolution: introduce controlled mutations to address specific problems or challenges.

🚀 Innovative Use Cases: The Future is Bright

The potential applications of DarwinAI are vast and varied:

Adaptive Automation: create agents that can adapt to new market conditions or evolving industrial requirements.
Collaborative Robots: develop robots that evolve to improve teamwork in dynamic environments.
Scientific Discovery: agents that combine skills to uncover patterns or solutions that were previously unknown.

🚀 Vision for the Future: An Ecosystem of Evolving Intelligence

By fostering an ecosystem where knowledge is accumulated and adaptability is paramount, DarwinAI aims to produce agents that are not only intelligent but also diverse and efficient. These agents will be equipped to handle complex, unpredictable challenges, opening up new frontiers in AI research and application.

🌐 Join Us in Shaping the Future of AI!

DarwinAI is more than just a project; it's a community-driven movement towards a new era of AI. We invite you to join us, contribute your ideas, and help shape the future of evolutionary AI. Whether you're a developer, researcher, or simply someone excited about the potential of AI, there's a place for you in this journey. Let's evolve together! 🌱💻

[R] Forecasting and Mitigating Security Threats from Malicious AI Applications
reddit
LLM Vibe Score0
Human Vibe Score1
Successful-Western27This week

[R] Forecasting and Mitigating Security Threats from Malicious AI Applications

This paper provides a systematic analysis of potential malicious applications of AI systems across digital, physical, and political security domains. The methodology involves:

Surveying dual-use AI capabilities that could enable attacks

Mapping specific attack vectors and required technical capabilities

Analyzing the evolution of attacker/defender dynamics

Developing a framework for threat assessment and mitigation

Key technical findings:

ML advances in areas like NLP and computer vision lower barriers to sophisticated attacks

Automated systems can significantly scale up traditional attack vectors

Transfer learning and GANs enable rapid adaptation of attack techniques

Technical countermeasures alone are insufficient - policy/governance frameworks needed

The researchers provide a detailed assessment framework examining:

Technical requirements for different attack types

Estimated timeline for capability development

Difficulty of execution and potential impact

Proposed defensive measures and their limitations

I think this work is important for helping the ML community get ahead of security risks before they materialize. The framework provides a structured way to evaluate emerging threats, though I expect the specific attack vectors will evolve significantly as capabilities advance. I think we need much more research on measuring the effectiveness of proposed countermeasures and understanding the co-evolution of offensive/defensive capabilities. The policy recommendations are a good start but will require ongoing refinement.

TLDR: Systematic analysis of how ML advances could enable new attack vectors across security domains. Provides a framework for assessing and mitigating threats through both technical and policy measures.

Full summary is here. Paper here.

[Research] Towards Safety-Aware Computing System Design in Autonomous Vehicles
reddit
LLM Vibe Score0
Human Vibe Score0
cdossmanThis week

[Research] Towards Safety-Aware Computing System Design in Autonomous Vehicles

https://medium.com/ai%C2%B3-theory-practice-business/ai-scholar-building-safety-aware-computing-system-design-in-autonomous-vehicles-d7065f4239fe

Abstract: Recently, autonomous driving development has ignited competition among car makers and technology corporations. Low-level automation cars are already commercially available, but highly automated vehicles, where the vehicle drives by itself without human monitoring, are still in their infancy. Such autonomous vehicles (AVs) rely on the in-vehicle computing system to interpret the environment and make driving decisions, so computing system design is essential, particularly for ensuring driving safety. However, to our knowledge, no clear guideline exists so far for safety-aware AV computing system and architecture design. To understand the safety requirements of AV computing systems, we performed a field study by running industrial Level-4 autonomous driving fleets across various locations, road conditions, and traffic patterns. The field study indicates that traditional computing system performance metrics, such as tail latency, average latency, maximum latency, and timeout, cannot fully capture the safety requirements of AV computing system design. To address this issue, we propose a "safety score" as a primary metric for measuring the level of safety in AV computing system design. Furthermore, we propose a perception latency model that helps architects estimate the safety score of a given architecture and system design without physically testing it in an AV. We demonstrate the use of our safety score and latency model by developing and evaluating a safety-aware hardware resource management scheme for AV computing systems.
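The abstract does not reproduce the paper's actual safety-score formula, so the snippet below is only a hypothetical illustration of the core idea: tying each perception-latency sample to the driving context's reaction-time budget rather than reporting context-free tail or average latency. The budget model (a simplified braking calculation) and the pass-fraction scoring are assumptions, not the authors' metric.

```python
from typing import List

def reaction_budget_ms(speed_mps: float, obstacle_dist_m: float,
                       decel_mps2: float = 6.0) -> float:
    """Time left for computing after subtracting braking time (toy physics)."""
    braking_time = speed_mps / decel_mps2
    time_to_obstacle = obstacle_dist_m / max(speed_mps, 0.1)
    return max(time_to_obstacle - braking_time, 0.0) * 1000.0

def safety_score(latencies_ms: List[float], budget_ms: float) -> float:
    """Fraction of frames whose perception latency fits the safety budget.
    Unlike a bare tail-latency number, this grounds each sample in context."""
    if not latencies_ms:
        return 1.0
    return sum(lat <= budget_ms for lat in latencies_ms) / len(latencies_ms)

# Example: at 15 m/s with an obstacle 60 m ahead, how many frames kept up?
budget = reaction_budget_ms(speed_mps=15.0, obstacle_dist_m=60.0)
print(safety_score([80.0, 120.0, 450.0, 95.0], budget))
```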
