VibeBuilders.ai

League

Explore resources related to league to help implement AI solutions for your business.

[D] Playing big league at home on a budget?
reddit
LLM Vibe Score: 0
Human Vibe Score: 0.778
ballerburg9005 · This week

I am a hobbyist and my Nvidia 660 is 10 years old and only has 2GB. Obviously that isn't going to cut it anymore. I am thinking about options here. I don't have thousands and thousands of dollars. And I highly doubt that spending close to a thousand dollars on a brand new card is still viable in 2020-2022.

I wanted to use Wavenet today and then found out about Melnet. I mean, maybe I could run Wavenet, but nobody in their right mind wants to after hearing Melnet results. On Github this one guy complained he couldn't get his implementation to work due to OOM with 2x 2080 RTX, which he bought solely for this purpose. Then on the other repo the guy casually mentioned that tier XY doesn't fit with some 10 year old lowfi dataset, even with batch size 1, on a 16GB Tesla P100.

The wisdom for OOM has always been "decrease batch size". But as far as I can tell, for most of the interesting stuff from the last 8 years or so you simply can't decrease batch size. Either because batch sizes are already so tiny, or because the code is written in a way that would require you to somehow turn it inside out, probably involving extreme knowledge of higher mathematics. I am a hobbyist, not a researcher. I am happy if I can crudely grasp what is going on. Most of the field suffers from exactly the same issue: it simply won't run without utterly absurd amounts of VRAM.

So what about buying shitty cheapo AMD GPUs with lots of VRAM? This seems to be the sensible choice if you want to be able to run anything noteworthy at all that comes up in the next 2 years and maybe beyond. People say, don't buy AMD, it's slow and it sucks, but those are apparently the same people that buy a 16GB Titan GPU for $1500 three times on Ebay without hesitation, when there are also 16GB AMD GPUs for $300. How much slower are AMD GPUs really? Let's say they are 5 times cheaper so they could be just 5 times slower.
So I have to train my model overnight instead of seeing the result in the afternoon. That would be totally awesome, given that the alternative is to buy a $300 Nvidia GPU, which has maybe 4 or 6GB and simply can't run the code without running out of memory. And say $300 is not enough, let's buy a $700 RTX 3080. It still only has 10GB of VRAM, not even 16GB. Then it's just as useless! What's the point of buying a fast GPU if it can't even run the code?

I don't know how much slower AMD GPUs really are. Maybe they are not 5x but 50x slower. Then of course training a model that was developed on some 64GB Tesla might take months or years. But maybe speed is not the issue, only memory. I have seen some stuff even being optimized for CPU, apparently because there weren't any big enough GPUs around. I don't really know how viable that can be (it seems it rarely if ever is), I have no experience.

And what about renting AWS? Let's say I am a beginner and I want to toy around for a week and probably max out 4 Teslas like 80% of the time without really getting anywhere. How expensive is that? $25, $50, $100, $500? (Found the answer: fucking $2000 https://aws.amazon.com/ec2/instance-types/p3/ ) Ok, so AWS is bullshit, here it's 6x cheaper: https://vast.ai/console/create/ . They don't really have 4x 16GB V100 though, just one V100. $0.5 per hour * 24 * 7 = $84 per week (there are more hidden costs like bandwidth; they don't seem to be huge, but I never used this so don't take it at face value). On AWS the same is over $3 per hour. So a day on vast.ai is $12, this could be viable! (look at calculation below).

There really isn't much info on the net about hardware requirements and performance for machine learning stuff. What bothers me the most is that people seem to be very ignorant of the VRAM issue. Either because they aren't looking ahead to what might come in 1-2 years.
Or because they are simply so rich they have no issue spending thousands and thousands of dollars every year instead of just 500 every couple of years. Or maybe they are both. So, yeah, what are your thoughts?

Here is what I found out just today: Until 2 years ago, tensorflow and pytorch wouldn't work with AMD cards, but this has changed. https://rocmdocs.amd.com/en/latest/Deep_learning/Deep-learning.html For older cards though, ROCm only works with certain CPUs: it needs PCIe 3.0 with atomics (see: https://github.com/RadeonOpenCompute/ROCm ). So you can't simply buy any 16GB card for $300 on Ebay like I suggested, even if it supports ROCm, because it will only work in "newer" PCs. The newer GFX9 AMD cards (like Radeon VII and Vega) don't suffer from this problem and work with PCIe 2.0 again... Although I have seen 16GB Vega cards for like $350 on Ebay, I think that is a pretty rare catch. However, looking 1-2 years into the future, this is great because Radeon VII prices will be pushed down hugely by Nvidia 3000 series hype (maybe down to $180 even), and maybe the next gen cards from AMD will even have 24 or 32GB for $500-$1000 and can still run on old machines.

According to this https://arxiv.org/pdf/1909.06842.pdf Radeon VII 16GB performs only half as well as Tesla V100 16GB, whereas V100 should be roughly along the lines of 11GB RTX 2080 Ti. So you could say that you get half the RAM, double the speed, double the price. I am not sure though if that holds. I think they were putting 16GB in those cards trying to push them for ML with ROCm, clearly addressing the problem of the time, but no one really jumped on the train, and now Revnet shrinks RAM use but needs more processing power. So they released 8GB cards again with slightly better performance, and I guess we are lucky if the next generation even has 16GB, because games probably don't need it at all.
Still, with Revnets and everything said in the comments, I think on a budget you are on the safe side buying the card with the most VRAM, rather than the most performance. Tomorrow some paper might come out that uses another method, then you can't trick-shrink your network anymore and everyone needs to buy big ass cards again like it used to be and can do nothing but throw their fancy faster cards in the dumpster. Also the huge bulk of ML currently focuses on image processing, while sound has only been gaining real momentum recently, and this will be followed by video processing and eventually human-like thought processes that sit atop all that and have not even been tackled yet. It's a rapidly evolving field, hard to predict what will come and stay. Running out of VRAM means total hardware failure, running slower just means waiting longer. If you just buy the newest card every year, it's probably safe to buy the fast card because things won't change that fast after all. If you buy a new card every 4 years or longer, then just try to get as much VRAM as possible.

Check this out: https://www.techspot.com/news/86811-gigabyte-accidentally-reveals-rtx-3070-16gb-rtx-3080.html There will be a 3070 16GB version!

Let's compare renting one V100 at $12/day vs. buying a 3070 Ti 16GB: The 2080 Ti was 1.42x the price of the regular 2080 and released the next summer. So let's assume the same will be true of the 3070 Ti, so it will cost $700. That is about $30/month or $0.96/day over two years, and about $15/month or $0.48/day over four years (by which time you can probably rent some 32GB Tesla card for the same price and nothing recent runs on less anymore). If you max out your setup 24/7 all year, then power cost obviously becomes a huge factor in that figure. In my country running at 500W costs $4.21/day, or $1.60 per 9 hours overnight. If you live elsewhere it might be as little as a quarter of that price.
Of course your PC may run 10h a day anyway, so it's maybe just 300W extra, and an older graphics card is inefficient for games, it eats more watts to do the same things, so you save some there as well. There is a lot to take into account when comparing.

Anyway, factoring in power cost, to break even with buying the card vs. renting within two years, you would have to use it for at least 4 days a month, or almost 2 weeks every 3 months. If you use it less than that, you maybe have a nice new graphics card and less hassle with pushing stuff back and forth onto servers all the time. But it would have been more economical to rent. So renting isn't that bad after all.

Overall, if you are thinking about having this as your hobby, you could say that it will cost you at least $30 per month, if not $50 or more (keeping up to date with cards every 2 instead of 4 years, plus using it more, costs more power). I think that is quite hefty. Personally I am not even invested enough in this, even if it weren't beyond my finances. I want a new card of course and also to play some new games, but I don't really need to. There are a lot of other (more) important things I am interested in that are totally free.
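The rent-vs-buy comparison above can be sketched in a few lines of Python. This is only a rough sketch using the figures quoted in the post ($700 card, two years, $0.5/hour rental, $4.21/day for power at 500W); the function names are made up for illustration, and it assumes a "day of use" means a full 24 hours and that owning costs only the amortized price plus power.

```python
def rental_cost(hourly_rate: float, hours: float) -> float:
    """Total rental cost for a number of hours at a flat hourly rate."""
    return hourly_rate * hours


def amortized_monthly_cost(card_price: float, years: float) -> float:
    """Purchase price spread evenly over the ownership period, per month."""
    return card_price / (years * 12)


def break_even_days_per_month(card_price: float, years: float,
                              rent_per_day: float, power_per_day: float) -> float:
    """Days of use per month at which owning costs the same as renting.

    Owning per month:  amortized price + power_per_day * days
    Renting per month: rent_per_day * days
    """
    saving_per_day = rent_per_day - power_per_day
    if saving_per_day <= 0:
        raise ValueError("renting is never more expensive per day, no break-even")
    return amortized_monthly_cost(card_price, years) / saving_per_day


# Figures taken from the post (assumptions, not measured prices).
rent_per_day = rental_cost(0.5, 24)        # one rented V100 for a day
rent_per_week = rental_cost(0.5, 24 * 7)   # the same, around the clock for a week
days = break_even_days_per_month(700, 2, rent_per_day, 4.21)

print(f"rent per day:  ${rent_per_day:.2f}")
print(f"rent per week: ${rent_per_week:.2f}")
print(f"break-even:    {days:.2f} days of use per month")
```

With these inputs the break-even lands a bit under 4 days a month, which is where the "at least 4 days a month" figure comes from. Changing the power price or the ownership period moves it quite a lot, so plug in your own numbers.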

160 of Y Combinator's 229 Startup Cohort are AI Startups and 75% of the Cohort has 0 revenue
reddit
LLM Vibe Score: 0
Human Vibe Score: 1
Democratizingfinance · This week

Y Combinator (YC), one of the most prestigious startup accelerators in the world, has just unveiled its latest batch of innovative startups, providing key insights into what the future might hold.

Y Combinator's Summer 2023 Batch

In a recent post, Garry Tan, YC's president, offers a nostalgic look back at his first YC Demo Day in 2008, where he, as a budding entrepreneur, pitched his startup. Now, fifteen years later, he's at the helm, launching the 37th Demo Day, this time for the Summer 2023 batch. Tan declares this batch one of YC's most impressive yet, emphasizing the deep technical talent of the participants. From a pool of over 24,000 applications, only 229 startups were chosen, making this one of the most competitive batches to date.

This batch marks a number of firsts and solidifies several rising trends within the startup landscape. 75% of these companies began their YC journey with zero revenue, and 81% hadn't raised any funding before joining the accelerator. YC's decision to focus on early-stage startups this round signals its commitment to nurturing raw, untapped potential.

A Return to Face-to-Face Interaction

After three years, YC has brought back the in-person Demo Day format, allowing startups, investors, and mentors to connect directly. While the virtual format has its merits, there's an unmistakable magic in the YC Demo Day room, filled with anticipation, hope, and innovation.

AI Takes Center Stage

Artificial Intelligence is the standout sector in the Summer 2023 batch. With recent advancements making waves across various industries, there's arguably no better time to launch an AI-focused startup, and no better platform than YC to foster its growth. This signals a clear trend in the startup investing and venture capital space: AI is just getting started. Of the 229 startups in the Summer 2023 batch, 160 are utilizing or implementing artificial intelligence in some capacity.
This means over 2 out of every 3 startups accepted are focused on artificial intelligence in some capacity. Some of the startups include:

- Quill AI: Automating the job of a financial analyst
- Fiber AI: Automating prospecting and outbound marketing
- Reworkd AI: Open Source Zapier of AI Agents
- Watto AI: AI-powered McKinsey-quality reports in seconds
- Agentive: AI-powered auditing platform
- Humanlike: Replace your call center with voice bots that sound human
- Greenlite: AI compliance team for fintech and banking
- atla: AI assistants to help in-house lawyers answer legal questions
- Studdy: An AI Math tutor
- Glade: League of Legends with AI-generated maps and gameplay

and literally over 100 others. As you can see, there's a startup covering nearly every sector of AI in the new batch.

YC By The Numbers

YC continues to grow as a community. The accelerator now boasts over 10,000 founders spanning more than 4,500 startups. The success stories are impressive: over 350 startups valued at over $150 million and 90 valued at more than $1 billion. The unicorn creation rate of 5% is truly unparalleled in the industry. To cater to the ever-growing community, YC has added more full-time Group Partners than ever. This includes industry veterans such as Tom Blomfield, co-founder of billion-dollar startups GoCardless and Monzo, and YC alumni like Wayne Crosby (Zenter) and Emmett Shear (Twitch).

YC Core Values

YC's commitment to diversity is evident in the demographics of the S23 batch. They've also spotlighted the industries these startups operate in, with 70% in B2B SaaS/Enterprise, followed by fintech, healthcare, consumer, and proptech/industrials. Garry Tan emphasizes three core tenets for YC investors: to act ethically, to make decisions swiftly, and to commit long-term. He underlines the importance of the YC community, urging investors to provide valuable introductions and guidance to founders.
The Road Ahead

With YC's track record and the promise shown by the Summer 2023 batch, the future of the startup ecosystem looks promising. As always, YC remains at the forefront, championing innovation and shaping the next generation of global startups.

Original Post: https://www.democratizing.finance/post/take-a-peek-into-the-future-with-y-combinators-finalized-summer-2023-batch

Study Plan for Learning Data Science Over the Next 12 Months [D]
reddit
LLM Vibe Score: 0
Human Vibe Score: 1
daniel-data · This week

In this thread, I address a study plan for 2021. In case you're interested, I wrote a whole article about this topic: Study Plan for Learning Data Science Over the Next 12 Months. Let me know your thoughts on this.

https://preview.redd.it/emg20nzhet661.png?width=1170&format=png&auto=webp&s=cf09e4dc5e82ba2fd7b57c706ba2873be57fe8de

We are ending 2020 and it is time to make plans for next year. Some of the most important questions we must ask are: what do we want to study? What do we want to enhance? What changes do we want to make? And what direction are we going to take (or continue) in our professional careers?

Many of you will be starting on the road to becoming a data scientist. In fact you may still be evaluating it, since you have heard a lot about it but have some doubts, for example about the number of job offers that may exist in this area, about the technology itself, and about the path you should follow, considering the wide range of options to learn.

I’m a believer that we should learn from various sources, from various mentors, and from various formats. By sources I mean the various virtual platforms and face-to-face options that exist to study. By mentors I mean that it is always a good idea to learn from different points of view and from different teachers/mentors. By formats I mean the choices between books, videos, classes, and other formats where the information is contained. When we extract information from all these sources we reinforce the knowledge learned, but we always need a guide, and this post aims to give you some practical insights and strategies in this regard.

The choice of sources, mentors and formats is up to you. It depends on your preferences and ease of learning: for example, some people are better at learning from books, while others prefer to learn from videos.
Some prefer to study on platforms that are practical (following online code), and others prefer traditional platforms like those at universities (Master’s degrees, PhDs or MOOCs). Some prefer to pay for quality content, while others prefer to look only for free material. That’s why I won’t give a specific recommendation in this post, but I’ll give you the whole picture: a study plan.

To start, you should consider the time you’ll spend studying and the depth of learning you want to achieve. If you find yourself without a job you could be available full time to study, which is a huge advantage. On the other hand, if you are working, you’ll have less time and you’ll have to discipline yourself to make time in the evenings, mornings or weekends. Ultimately, the important thing is to meet the goal of learning and perhaps dedicating your career to this exciting area!

We will divide the year into quarters as follows:

- First Quarter: Learning the Basics
- Second Quarter: Upgrading the Level: Intermediate Knowledge
- Third Quarter: A Real World Project - A Full-stack Project
- Fourth Quarter: Seeking Opportunities While Maintaining Practice

First Quarter: Learning the Basics

https://preview.redd.it/u7t9bthket661.png?width=998&format=png&auto=webp&s=4ad29cb43618e7acf793259243aa5a60a8535f0a

If you want to be more rigorous you can have start and end dates for this period of study of the basics. It could be something like: from January 1 to March 30, 2021 as deadline. During this period you will study the following:

A programming language that you can apply to data science: Python or R. We recommend Python due to the simple fact that approximately 80% of data science job offers ask for knowledge of Python. That same percentage is maintained with respect to the real projects you will find implemented in production.
And we add the fact that Python is multipurpose, so you won’t “waste” your time if at some point you decide to focus on web development, for example, or desktop development. This would be the first topic to study in the first months of the year.

Familiarize yourself with statistics and mathematics. There is a big debate in the data science community about whether we need this foundation or not. I will write a post later on about this, but the reality is that you DO need it, but ONLY the basics (at least in the beginning). And I want to clarify this point before continuing.

We could say that data science is divided into two big fields: research on one side and putting Machine Learning algorithms into production on the other. If you later decide to focus on research then you are going to need mathematics and statistics in depth (very in depth). If you are going to go for the practical part, the libraries will help you deal with most of it, under the hood. It should be noted that most job offers are in the practical part.

For both cases, in this first stage you will only need the basics of:

Statistics (with Python and NumPy)
- Descriptive statistics
- Inferential statistics
- Hypothesis testing
- Probability

Mathematics (with Python and NumPy)
- Linear Algebra (for example: SVD)
- Multivariate Calculus
- Calculus (for example: gradient descent)

Note: We recommend that you study Python first before seeing statistics and mathematics, because the challenge is to implement these statistical and mathematical bases with Python. Don’t look for theoretical tutorials that show only slides, or statistical and/or mathematical examples in Excel/Matlab/Octave/SAS and other tools different from Python or R; it gets very boring and impractical! You should choose a course, program or book that teaches these concepts in a practical way and using Python. Remember that Python is what we finally use, so you need to choose well.
This advice is key so you don’t give up on this part, as it will be the most dense and difficult. If you have these basics in the first three months, you will be ready to make a leap in your learning for the next three months.

Second Quarter: Upgrading the Level: Intermediate Knowledge

https://preview.redd.it/y1y55vynet661.png?width=669&format=png&auto=webp&s=bd3e12bb112943025c39a8975faf4d64514df275

If you want to be more rigorous you can have start and end dates for this period of study at the intermediate level. It could be something like: from April 1 to June 30, 2021 as deadline. Now that you have a good foundation in programming, statistics and mathematics, it is time to move forward and learn about the great advantages that Python has for data analysis. For this stage you will be focused on:

Data science Python stack

Python has the following libraries that you should study, know and practice at this stage:

- Pandas: for working with tabular data and making in-depth analysis
- Matplotlib and Seaborn: for data visualization

Pandas is the de facto library for data analysis. It is one of the most important (if not the most important) and powerful tools you should know and master during your career as a data scientist. Pandas will make it much easier for you to manipulate, cleanse and organize your data.

Feature Engineering

Many times people don’t go deep into Feature Engineering, but if you want to have Machine Learning models that make good predictions and improve your scores, spending some time on this subject is invaluable! Feature engineering is the process of using domain knowledge to extract features from raw data using data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered applied machine learning in itself. To achieve the goal of good feature engineering you must know the different techniques that exist, so it is a good idea to at least study the main ones.
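To make the feature engineering idea a bit more concrete, here is a minimal Pandas sketch. The table, its column names and the `build_features` helper are all invented for illustration (they are not from the article); it only shows three common techniques: combining columns, extracting date parts, and one-hot encoding a categorical column.

```python
import pandas as pd

# Hypothetical raw data: a tiny table of online orders.
raw = pd.DataFrame({
    "order_date": ["2021-01-04", "2021-01-09", "2021-01-10"],
    "unit_price": [10.0, 4.5, 20.0],
    "quantity": [2, 10, 1],
    "category": ["books", "toys", "books"],
})


def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Derive model-ready features from the raw order columns."""
    out = df.copy()
    out["order_date"] = pd.to_datetime(out["order_date"])

    # Combine two raw columns into a more informative one.
    out["order_total"] = out["unit_price"] * out["quantity"]

    # Extract calendar information (Monday=0 ... Sunday=6).
    out["day_of_week"] = out["order_date"].dt.dayofweek
    out["is_weekend"] = out["day_of_week"] >= 5

    # One-hot encode the categorical column.
    out = pd.get_dummies(out, columns=["category"], prefix="cat")

    # The raw date is no longer needed once its parts are extracted.
    return out.drop(columns=["order_date"])


features = build_features(raw)
print(features)
```

Which features actually help depends on the problem and on your domain knowledge, so treat this as a starting pattern rather than a recipe.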
Basic Models of Machine Learning

At the end of this stage you will start with the study of Machine Learning. This is perhaps the most awaited moment! This is where you start to learn about the different algorithms you can use, which particular problems you can solve and how you can apply them in real life. The Python library we recommend you start experimenting with for ML is scikit-learn. However, it is a good idea to find tutorials that explain the implementation of the algorithms (at least the simplest ones) from scratch with Python, since the library could be a “Black Box” and you might not understand what is happening under the hood. If you learn how to implement them with Python, you can have a more solid foundation. If you implement the algorithms with Python (without a library), you will put into practice everything seen in the statistics, mathematics and Pandas part.

These are some recommendations of the algorithms that you should at least know in this initial stage:

Supervised learning
- Simple Linear Regression
- Multiple Linear Regression
- K-nearest neighbors (KNN)
- Logistic Regression
- Decision Trees
- Random Forest

Unsupervised Learning
- K-Means
- PCA

Bonus: if you have the time and you are within the time ranges, you can study these others:

Gradient Boosting Algorithms
- GBM
- XGBoost
- LightGBM
- CatBoost

Note: do not spend more than the 3 months stipulated for this stage, because you will be falling behind and not complying with the study plan. We all have shortcomings at this stage, it is normal; go ahead, and later you can return to concepts that you did not understand in detail. The important thing is to have the basic knowledge and move forward!

If you at least manage to study the mentioned algorithms of supervised and unsupervised learning, you will have a very clear idea of what you will be able to do in the future.
So don’t worry about covering everything, remember that it is a process, and ideally you should have some clearly established times so that you don’t get frustrated and can feel you are advancing. This concludes your “theoretical” study of the basics of data science. Now we’ll continue with the practical part!

Third Quarter: A Real World Project - A Full-stack Project

https://preview.redd.it/vrn783vqet661.png?width=678&format=png&auto=webp&s=664061b3d33b34979b74b10b9f8a3d0f7b8b99ee

If you want to be more rigorous you can have start and end dates for this project period. It could be something like: from July 1 to September 30, 2021 as deadline. Now that you have a good foundation in programming, statistics, mathematics, data analysis and machine learning algorithms, it is time to move forward and put all this knowledge into practice. Many of these suggestions may sound out of the box, but believe me they will make a big difference in your career as a data scientist.

The first thing is to create your web presence:

Create a Github (or GitLab) account, and learn Git. Being able to manage different versions of your code is important, you should have version control over them, not to mention that having an active Github account is very valuable in demonstrating your true skills. On Github, you can also set up your Jupyter Notebooks and make them public, so you can show off your skills as well. This is mine for example: https://github.com/danielmoralesp

Learn the basics of web programming. The advantage is that you already have Python as a skill, so you can learn Flask to create a simple web page. Or you can use a platform like Github Pages, Ghost or Wordpress itself and create your online portfolio.

Buy a domain with your name. Something like myname.com, myname.co, myname.dev, etc. This is invaluable so you can have your CV online and update it with your projects.
There you can make a big difference, showing your projects, your Jupyter Notebooks and showing that you have the practical skills to execute projects in this area. There are many front-end templates available, free or paid, that give it a more personalized and pleasant look. Don’t use free sub-domains of Wordpress, Github or Wix; it looks very unprofessional, so make your own. Here is mine for example: https://www.danielmorales.dev/

Choose a project you are passionate about and create a Machine Learning model around it. The final goal of this third quarter is to create ONE project that you are passionate about and that is UNIQUE among others. It turns out that there are many typical projects in the community, such as predicting the Titanic survivors, or predicting the price of houses in Boston. Those kinds of projects are good for learning, but not for showing off as your UNIQUE projects.

If you are passionate about sports, try predicting the soccer results of your local league. If you are passionate about finance, try predicting your country’s stock market prices. If you are passionate about marketing, try to find someone who has an e-commerce business, implement a product recommendation algorithm and upload it to production. If you are passionate about business: make a predictor of the best business ideas for 2021 :)

As you can see, you are limited only by your passions and your imagination. In fact, those are the two keys for you to do this project: Passion and Imagination. However, don’t expect to make money from it. You are in a learning stage: you need that algorithm to be deployed in production, so make an API in Flask with it, and explain on your website how you did it and how people can access it. This is the moment to shine, and at the same time it’s the moment of the greatest learning.
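As a rough idea of what "make an API in Flask" can look like, here is a minimal sketch. The `/predict` route, the JSON field names and the `predict_home_win_probability` function are hypothetical placeholders; in a real project that function would call your trained model (for example one loaded from disk) instead of the toy formula used here.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_home_win_probability(home_goals_avg: float, away_goals_avg: float) -> float:
    """Hypothetical stand-in for a trained model's prediction."""
    total = home_goals_avg + away_goals_avg
    return 0.5 if total == 0 else home_goals_avg / total


@app.route("/predict", methods=["POST"])
def predict():
    # Read the JSON body; fall back to an empty dict if it is missing or invalid.
    payload = request.get_json(silent=True) or {}
    try:
        home = float(payload["home_goals_avg"])
        away = float(payload["away_goals_avg"])
    except (KeyError, TypeError, ValueError):
        return jsonify(error="expected numeric home_goals_avg and away_goals_avg"), 400
    return jsonify(home_win_probability=predict_home_win_probability(home, away))
```

Saved as a module, this can be served locally with Flask's development server, and a client then sends a POST request with a JSON body to `/predict`. For a public deployment you would put it behind a production WSGI server, which is part of the learning the post is talking about.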
You will most likely face obstacles. If your algorithm gives 60% accuracy after a huge optimization effort, it doesn’t matter: finish the whole process, deploy it to production, try to get a friend or family member to use it, and that will be the goal achieved for this stage: make a full-stack Machine Learning project. By full-stack I mean that you did all the following steps:

- You got the data from somewhere (scraping, open data or API)
- You did a data analysis
- You cleaned and transformed the data
- You created Machine Learning models
- You deployed the best model to production for other people to use

This does not mean that this whole process is what you will always do in your daily job, but it does mean that you will know every part of the pipeline that is needed for a data science project for a company. You will have a unique perspective!

Fourth Quarter: Seeking Opportunities While Maintaining Practice

https://preview.redd.it/qd0osystet661.png?width=1056&format=png&auto=webp&s=2da456b15985b2793041256f5e45bca99a23b51a

If you want to be more rigorous you can have start and end dates for this period of study at the final level. It could be something like: from October 1 to December 31, 2021 as deadline.

Now you have theoretical and practical knowledge. You have implemented a model in production. The next step depends on you and your personality. Let’s say you are an entrepreneur, and you have the vision to create something new from something you discovered, or you saw an opportunity to do business with this discipline. Then it’s time to start planning how to do it. If that’s the case, obviously this post won’t cover that process, but you should know what the steps might be (or start figuring them out). But if you are one of those who want to get a job as a data scientist, here is my advice.
Getting a job as a data scientist

“You’re not going to get a job as fast as you think, if you keep thinking the same way” (Author)

It turns out that all people who start out as data scientists imagine themselves working for the big companies in their country or region. Or even remotely. It turns out that if you aspire to work for a large company as a data scientist you will be frustrated by the years of experience they ask for (3 or more years) and the skills they request. Large companies don’t hire Juniors (or very few do), precisely because they are already large companies. They have the financial muscle to demand experience and skills and can pay a commensurate salary (although this is not always the case). The point is that if you focus there you’re going to get frustrated!

Here we must return to the following advice: “You need creativity to get a job in data science”. Like everything else in life we have to start at different steps, in this case, from the beginning. Here are the scenarios:

If you are working in a company and in a non-engineering role, you must demonstrate your new skills to the company you are working for. If you are working in the customer service area, you should apply them to your work and do, for example, a detailed analysis of your calls, conversion rates and store data, and make predictions about them! If you can get data from your colleagues, you could try to predict their sales! This may sound funny, but it’s about how creatively you can apply data science to your current work, how you show your bosses how valuable it is, and how you EVANGELIZE them about the benefits of implementation. You’ll be noticed and they could certainly create a new data-related department or job. And you already have the knowledge and experience. The key word here is Evangelize. Many companies and entrepreneurs are just beginning to see the power of this discipline, and it is your task to nurture that reality.
If you are working in an area related to engineering, but that is not data science. Here the same applies as in the previous example, but you have some advantages: you could access the company’s data and use it for the benefit of the company, making analyses and/or predictions about it, and again EVANGELIZING your bosses about your new skills and the benefits of data science.

If you are unemployed (or do not want to, or do not feel comfortable, following the two examples above), you can start looking outside. What I recommend is that you look for technology companies and/or startups where they are just forming their first teams and are paying some salary, or even offering stock options in the company. Obviously here the salaries will not be exorbitant, and the working hours could be longer, but remember that you are in the learning and practice stage (just on the first step), so you cannot demand too much. You must ground your expectations in that reality, and stop expecting to be paid $10,000 a month at this stage. But, depending on your country, $1,000 USD could be something very interesting to start this new career. Remember, you are a Junior at this stage.

The conclusion is: don’t waste your time looking at and/or applying to offers from big companies, because you will get frustrated. Be creative, and look for opportunities in smaller or newly created companies.

Learning never stops

While you are in that process of looking for a job or an opportunity, which could take half of your time (50% looking for opportunities, 50% staying in practice), you have to keep learning. You should advance to concepts such as Deep Learning, Data Engineering or other topics that you feel were left loose from the past stages, or focus on the topics that you are passionate about within this group of disciplines in data science.
At the same time you can choose a second project, and spend some time running it end to end, and thus grow your portfolio and your experience. If this is the case, try to find a completely different project: if the first one was done with Machine Learning, let this second one be done with Deep Learning. If the first one was deployed to a web page, let this second one be deployed to a mobile platform. Remember, creativity is the key!

Conclusion

We are at an ideal time to plan for 2021, and if this is the path you want to take, start looking for the platforms and media you want to study on. Get to work and don’t miss this opportunity to become a data scientist in 2021!

Note: we are building a private Slack community of data scientists. If you want to join us, write to support@datasource.ai.

I hope you enjoyed this reading! You can follow me on Twitter or LinkedIn. Thank you for reading!

An Algorithm for Making Truly Stand-Out Advertising Content (+ something more | Part 1)
reddit
LLM Vibe Score0
Human Vibe Score1
asealey1This week


Hi everyone. My friend and I are software engineers and new to marketing. A few months ago we decided to leverage our software skills for a colleague in ecommerce. We started by implementing a Flux.1 model, then began using texture-based recreations with a canny mask, and then found that we could optimize on both with an added layer of inpainting... and the list goes on. This is the first of a series of posts here about it, and I look forward to learning from your feedback.

I realized that the most difficult parts of the marketing process when I started out (and most likely for other beginners too) are:

1. Customer Acquisition Costs / Brand Differentiation: Competition is intensifying, and it is getting more difficult to stand out in crowded markets and target ad spend more effectively.
2. Maintaining Authenticity at Scale / Data Overload: Balancing growth with authenticity and leveraging available data to successfully engage with customers is a big ask.
3. Creative Fatigue: Maintaining multiple marketing channels is hard, and it becomes harder when you're constantly demanding more and more creative content for campaigns.

For 1) I tried using AI to help me summarize, systematize, and gain insights from the information available for a given brand or product (from a page link, prompt, input image, etc.). I know AI is everywhere now, many people are using it unnecessarily, and many people are skeptical about it. However, I know from experience that it is quite helpful in gaining insights from and summarizing large amounts of data, and in helping people make sense of the creative content, strategy, campaign, etc., that should be created.

For 2) By leveraging reviews, forums, and other relevant brand information, AI is able to maintain the story that your brand currently tells, and enhance it based on how your customer base.

For 3) Faster results mean less creative fatigue. This translates to an easier time managing omnichannel marketing efforts and scaling advertising.
If you're interested, please have a look at the result at madsimpleads.com You’ll need to log in to access the solution, and I'll add credits to your account to try it out! (we want to prevent from random people or bots using it because I'm paying to multiple providers for model access). DM me here or drop me a line at austin@madsimpleads.com if you need more. Thank you so much, I'll be happy to get your thoughts I hope the website will help with your advertising, please reach out if you like what I do and want to support the project! Disclaimers: the website looks a bit rough in terms of UI/UX, but we tried focusing on the functionality first available on mobile, works better on desktop I hope this doesn't come across as trying to advertise for my business or breaking any of the community rules. genuinely looking for feedback. Thank you

Solopreneur making $40k MRR with a No Code SaaS sideproject
reddit
LLM Vibe Score0
Human Vibe Score1
bts_23This week


Hey, I'm Elias and I do case studies analyzing successful startups and solopreneurs. I wanted to share the summarized version of this one with you because this entrepreneurial journey blew my mind. This post is about FormulaBot (ExcelFormulaBot), an AI no-code SaaS founded by David Bressler back in August 2022. FormulaBot is currently making $40k MRR (monthly recurring revenue).

How did the founder come up with the idea?

David is a data guy who worked in analytics for several years. In July 2022, David got really interested in AI, especially ChatGPT. One night, he tried it out at home, just like we all did back then. But in his case, trying ChatGPT gave him a big idea. That idea ended up making him a lot of money and changing the lives of 750 million people who use Excel. That night David started by asking GPT easy questions, then complex ones. Since he used Excel a lot and helped his colleagues with it, he thought about an AI that could make Excel easier, for example by generating formulas from text. He looked online but found nothing. Seeing a big chance, he decided to do something about it.

What challenges did the founder face?

David didn’t have any idea how to develop an app. However, with no-code tools this is not a problem anymore. He discovered Bubble, a no-code web app tool that could connect with the OpenAI API. After learning Bubble from YouTube tutorials and through trial and error, and spending his nights studying the OpenAI API documentation, he launched the first version of the app in around three weeks.

Strategies that made the project successful

David validated his idea by posting about ExcelFormulaBot on a Reddit Excel subreddit, receiving surprising attention with 10,000 upvotes. This encouraged him to offer the tool for free to gather feedback. Facing a hefty $4,999 API bill after the Reddit post, David quickly monetized his product with a subscription-based SaaS website.
On launch day, 82 customers signed up, surpassing his expectations. A successful Product Hunt launch followed, generating $2.4k in sales within 24 hours, and a TikTok influencer with 4.5 million followers brought in thousands of new users overnight with a viral video. Marketing approach: -Paid ads: FormulaBot boosted website traffic with Paid Ads, notably on Google Ads, prioritizing Quality Score. This ensured ads aligned better with user searches, maximizing visibility and cost-efficiency, targeting those seeking Excel formula assistance. -SEO: a) Content/Keyword optimization: FormulaBot improved its SEO by making helpful pages about Excel formulas, like guides on topics such as "How to use SUMIFS." b) Site Speed Enhancement: David boosted FormulaBot's marketing site speed by moving it from Bubble to Framer, aiming to improve user experience and SEO performance. c) On-page optimization: David optimized FormulaBot's on-page elements by adjusting title tags, meta descriptions, and content to enhance SEO performance and align with search intent. These strategic refinements aimed to address ranking declines and emphasize FormulaBot's uniqueness, ultimately improving its visibility and competitiveness in search results. -Virality: FormulaBot went viral as users found it highly useful and cool. Influencers on platforms like TikTok and Twitter shared it with their followers because they found it valuable. Offering numerous free features further enhanced its appeal. Lessons: successes and mistakes. ✅ Leverage industry expertise: David identified a problem in analytics and used his experience to start an online business addressing it, turning an industry challenge into a profitable venture. ✅ Embrace learning new skills: Despite lacking initial technical know-how, David learned what he needed to develop the software himself, demonstrating a commitment to continuous learning and adaptability crucial for success. 
❌ Minimize dependency on third parties: Relying solely on the ChatGPT API poses risks for FormulaBot. Any issues with the API could disrupt functionality and limit scalability. ⁉️ Caution with free tools: Offering a free tool can attract users and drive viral growth, but converting them to paying customers is challenging. Avoid relying solely on a 100% free model unless your revenue comes from non-user sources like ads. For businesses dependent on user subscriptions or purchases, balancing user attraction with conversion challenges is crucial. How could you replicate this idea step-by-step. To replicate the success of FormulaBot and similar AI wrapper startups, it's crucial to tread carefully in a competitive market. Avoid mere replication of existing solutions unless you can offer something distinct or superior. Consider these steps to effectively develop an AI Wrapper/ChatGPT wrapper product using Bubble as a no-code tool: Design the user interface: Utilize Bubble's drag-and-drop editor to create a user-friendly interface with input fields, buttons, and result displays. Set up workflows: Define workflows to connect the interface with the ChatGPT API, enabling seamless interaction between users and the AI. Integrate the ChatGPT API: Obtain the API key from OpenAI and integrate it into your app using Bubble's API connector feature. Test and gather feedback: Thoroughly test your app, soliciting feedback to refine functionality and usability. Refine and optimize: Continuously improve your app based on user input and testing results to enhance performance and user experience. The in-depth version of the case study was originally posted here. Feel free to comment if you have any questions, and let me know which similar ideas you'd like me to analyze.

[Discussion] When ML and Data Science are the death of a good company: A cautionary tale.
reddit
LLM Vibe Score0
Human Vibe Score0.6
AlexSnakeKingThis week


TL;DR: At Company A, Team X does advanced analytics using on-prem ERP tools and older programming languages. Their tools work very well and are designed based on very deep business and domain expertise. Team Y is a new and ambitious Data Science team that thinks they can replace Team X's tools with a bunch of R scripts and a custom-built ML platform. Their models are simplistic, but more "fashionable" compared to the econometric models used by Team X, and Team Y benefits from the ML/DS moniker, so leadership is allowing Team Y to start a large-scale overhaul of the analytics platform in question. Team Y doesn't have the experience for such a large-scale transformation, and is refusing to collaborate with Team X. This project is very likely going to fail, and cause serious harm to the company as a whole, financially and from a people perspective. I argue that this is not just because of bad leadership, but also because of various trends and mindsets in the DS community at large.

Update (jump to below the line for the original story): Several people in the comments are pointing out that this is just a management failure, not something due to ML/DS, and that you can replace DS with any buzz tech and the story will still be relevant. My response: Of course, any failure at an organization level is ultimately a management failure one way or the other. Moreover, it is also the case that ML/DS, when done correctly, will always improve a company's bottom line. There is no scenario where the proper ML solution, delivered at a reasonable cost and in a timely fashion, will somehow hurt the company's bottom line.
My point is that in this case management is failing because of certain trends and practices that are specific to the ML/DS community, namely: The idea that DS teams should operate independently of tech and business orgs -- too much autonomy for DS teams The disregard for domain knowledge that seems prevalent nowadays thanks to the ML hype, that DS can be generalists and someone with good enough ML chops can solve any business problem. That wasn't the case when I first left academia for the industry in 2009 (back then nobody would even bother with a phone screen if you didn't have the right domain knowledge). Over reliance on resources who check all the ML hype related boxes (knows Python, R, Tensorflow, Shiny, etc..., has the right Coursera certifications, has blogged on the topic, etc...), but are lacking in depth of experience. DS interviews nowadays all seem to be: Can you tell me what a p-value is? What is elastic net regression? Show me how to fit a model in sklearn? How do you impute NAs in an R dataframe? Any smart person can look those up on Stackoverflow or Cross-Validated,.....Instead teams should be asking stuff like: why does portfolio optimization use QP not LP? How does a forecast influence a customer service level? When should a recommendation engine be content based and when should it use collaborative filtering? etc... (This is a true story, happening to the company I currently work for. Names, domains, algorithms, and roles have been shuffled around to protect my anonymity)  Company A has been around for several decades. It is not the biggest name in its domain, but it is a well respected one. Risk analysis and portfolio optimization have been a core of Company A's business since the 90s. They have a large team of 30 or so analysts who perform those tasks on a daily basis. These analysts use ERP solutions implemented for them by one the big ERP companies (SAP, Teradata, Oracle, JD Edwards,...) 
or one of the major tech consulting companies (Deloitte, Accenture, PWC, Capgemini, etc...) in collaboration with their own in house engineering team. The tools used are embarrassingly old school: Classic RDBMS running on on-prem servers or maybe even on mainframes, code written in COBOL, Fortran, weird proprietary stuff like ABAP or SPSS.....you get the picture. But the models and analytic functions were pretty sophisticated, and surprisingly cutting edge compared to the published academic literature. Most of all, they fit well with the company's enterprise ecosystem, and were honed based on years of deep domain knowledge.  They have a tech team of several engineers (poached from the aforementioned software and consulting companies) and product managers (who came from the experienced pools of analysts and managers who use the software, or poached from business rivals) maintaining and running this software. Their technology might be old school, but collectively, they know the domain and the company's overall architecture very, very well. They've guided the company through several large scale upgrades and migrations and they have a track record of delivering on time, without too much overhead. The few times they've stumbled, they knew how to pick themselves up very quickly. In fact within their industry niche, they have a reputation for their expertise, and have very good relations with the various vendors they've had to deal with. They were the launching pad of several successful ERP consulting careers.  Interestingly, despite dealing on a daily basis with statistical modeling and optimization algorithms, none of the analysts, engineers, or product managers involved describe themselves as data scientists or machine learning experts. It is mostly a cultural thing: Their expertise predates the Data Science/ML hype that started circa 2010, and they got most of their chops using proprietary enterprise tools instead of the open source tools popular nowadays. 
A few of them have formal statistical training, but most of them came from engineering or domain backgrounds and learned stats on the fly while doing their job. Call this team "Team X".  Sometime around the mid 2010s, Company A started having some serious anxiety issues: Although still doing very well for a company its size, overall economic and demographic trends were shrinking its customer base, and a couple of so called disruptors came up with a new app and business model that started seriously eating into their revenue. A suitable reaction to appease shareholders and Wall Street was necessary. The company already had a decent website and a pretty snazzy app, what more could be done? Leadership decided that it was high time that AI and ML become a core part of the company's business. An ambitious Manager, with no science or engineering background, but who had very briefly toyed with a recommender system a couple of years back, was chosen to build a data science team, call it team "Y" (he had a bachelor's in history from the local state college and worked for several years in the company's marketing org). Team "Y" consists mostly of internal hires who decided they wanted to be data scientists and completed a Coursera certification or a Galvanize boot camp, before being brought on to the team, along with a few of fresh Ph.D or M.Sc holders who didn't like academia and wanted to try their hand at an industry role. All of them were very bright people, they could write great Medium blog posts and give inspiring TED talks, but collectively they had very little real world industry experience. As is the fashion nowadays, this group was made part of a data science org that reported directly to the CEO and Board, bypassing the CIO and any tech or business VPs, since Company A wanted to claim the monikers "data driven" and "AI powered" in their upcoming shareholder meetings. In 3 or 4 years of existence, team Y produced a few Python and R scripts. 
Their architectural experience consisted almost entirely in connecting Flask to S3 buckets or Redshift tables, with a couple of the more resourceful ones learning how to plug their models into Tableau or how to spin up a Kubernetes pod.

But they needn't worry: The aforementioned manager, who was now a director (and was also doing an online Masters to make up for his qualifications gap and bolster his chances of becoming VP soon - at least he now understands what L1 regularization is), was a master at playing corporate politics and self-promotion. No matter how few actionable insights team Y produced or how little code they deployed to production, he always had their back and made sure they had ample funding. In fact he now had grandiose plans for setting up an all-purpose machine learning platform that could be used to solve all of the company's data problems.

A couple of sharp-minded members of team Y, upon googling their industry name along with the words "data science", realized that risk analysis was a prime candidate for being solved with Bayesian models, and there was already a nifty R package for doing just that, whose tutorial they went through on R-Bloggers.com. One of them had even submitted a Bayesian classifier Kernel for a competition on Kaggle (he was 203rd on the leaderboard), and was eager to put his new-found expertise to use on a real world problem. They pitched the idea to their director, who saw a perfect use case for his upcoming ML platform. They started work on it immediately, without bothering to check whether anybody at Company A was already doing risk analysis. Since their org was independent, they didn't really need to check with anybody else before they got funding for their initiative. Although it was basically a Naive Bayes classifier, the term ML was added to the project title, to impress the board.

As they progressed with their work, however, tensions started to build.
They had asked the data warehousing and CA analytics teams to build pipelines for them, and word eventually got out to team X about their project. Team X was initially thrilled: They offered to collaborate whole heartedly, and would have loved to add an ML based feather to their already impressive cap. The product owners and analysts were totally onboard as well: They saw a chance to get in on the whole Data Science hype that they kept hearing about. But through some weird mix of arrogance and insecurity, team Y refused to collaborate with them or share any of their long term goals with them, even as they went to other parts of the company giving brown bag presentations and tutorials on the new model they created.  Team X got resentful: from what they saw of team Y's model, their approach was hopelessly naive and had little chances of scaling or being sustainable in production, and they knew exactly how to help with that. Deploying the model to production would have taken them a few days, given how comfortable they were with DevOps and continuous delivery (team Y had taken several months to figure out how to deploy a simple R script to production). And despite how old school their own tech was, team X were crafty enough to be able to plug it in to their existing architecture. Moreover, the output of the model was such that it didn't take into account how the business will consume it or how it was going to be fed to downstream systems, and the product owners could have gone a long way in making the model more amenable to adoption by the business stakeholders. But team Y wouldn't listen, and their leads brushed off any attempts at communication, let alone collaboration. The vibe that team Y was giving off was "We are the cutting edge ML team, you guys are the legacy server grunts. 
We don't need your opinion.", and they seemed to have a complete disregard for domain knowledge, or worse, they thought that all that domain knowledge consisted of was being able to grasp the definitions of a few business metrics.

Team X got frustrated and tried to express their concerns to leadership. But despite owning a vital link in Company A's business process, they were only ~50 people in a large, 1000-strong technology and operations org, and they were several layers removed from the C-suite, so it was impossible for them to get their voices heard.

Meanwhile, the unstoppable director was doing what he did best: Playing corporate politics. Despite how little his team had actually delivered, he had convinced the board that all analysis and optimization tasks should now be migrated to his yet-to-be-delivered ML platform. Since most leaders now knew that there was overlap between team Y and team X's objectives, his pitch was no longer that team Y was going to create a new insight, but that they were going to replace (or modernize) the legacy statistics-based on-prem tools with more accurate cloud-based ML tools. Never mind that there was no support in the academic literature for the idea that Naive Bayes works better than the Econometric approaches used by team X, let alone the additional wacky idea that Bayesian Optimization would definitely outperform the QP solvers that were running in production.

Unbeknownst to team X, the original Bayesian risk analysis project has now grown into a multimillion dollar major overhaul initiative, which includes the eventual replacement of all of the tools and functions supported by team X along with the necessary migration to the cloud. The CIO and a couple of business VPs are now on board, and tech leadership is treating it as a done deal. An outside vendor, a startup nobody had heard of, was contracted to help build the platform, since team Y has no engineering skills.
The choice was deliberate, as calling on any of the established consulting or software companies would have eventually led leadership to the conclusion that team X was better suited for a transformation on this scale than team Y.

Team Y has no experience with any major ERP deployments, and no domain knowledge, yet they are being tasked with fundamentally changing the business process that is at the core of Company A's business. Their models actually perform worse than those deployed by team X, and their architecture is hopelessly simplistic compared to what is necessary for running such a solution in production.

Ironically, using Bayesian thinking and based on all the evidence, the likelihood that team Y succeeds is close to 0%. At best, the project is going to end up being a write-off of 50 million dollars or more. Once the !@#$!@ hits the fan, a couple of executive heads are going to roll, and dozens of people will get laid off. At worst, given how vital risk analysis and portfolio optimization are to Company A's revenue stream, the failure will eventually sink the whole company. It probably won't go bankrupt, but it will lose a significant portion of its business and work force. Failed ERP implementations can and do sink large companies: Just see what happened to National Grid US, SuperValu or Target Canada.

One might argue that this is more about corporate dysfunction and bad leadership than about data science and AI. But I disagree. I think the core driver of this debacle is indeed the blind faith in Data Scientists, ML models and the promise of AI, and the overall culture of hype and self-promotion that is very common among the ML crowd.

We haven't seen the end of this story: I sincerely hope that this ends well for the sake of my colleagues and all involved. Company A is a good company, and both its customers and its employees deserve better.
But the chances of that happening are negligible given all the information available, and this failure will hit my company hard.

[D] Here are 17 ways of making PyTorch training faster – what did I miss?
reddit
LLM Vibe Score0
Human Vibe Score1
lorenzkuhnThis week


I've been collecting methods to accelerate training in PyTorch – here's what I've found so far. What did I miss? What did I get wrong?

The methods – roughly sorted from largest to smallest expected speed-up – are:

1. Consider using a different learning rate schedule.
2. Use multiple workers and pinned memory in DataLoader.
3. Max out the batch size.
4. Use Automatic Mixed Precision (AMP).
5. Consider using a different optimizer.
6. Turn on cuDNN benchmarking.
7. Beware of frequently transferring data between CPUs and GPUs.
8. Use gradient/activation checkpointing.
9. Use gradient accumulation.
10. Use DistributedDataParallel for multi-GPU training.
11. Set gradients to None rather than 0.
12. Use .as_tensor() rather than .tensor().
13. Turn off debugging APIs if not needed.
14. Use gradient clipping.
15. Turn off bias before BatchNorm.
16. Turn off gradient computation during validation.
17. Use input and batch normalization.

Consider using another learning rate schedule

The learning rate (schedule) you choose has a large impact on the speed of convergence as well as the generalization performance of your model. Cyclical Learning Rates and the 1Cycle learning rate schedule are both methods introduced by Leslie N. Smith (here and here), and then popularised by fast.ai's Jeremy Howard and Sylvain Gugger (here and here). Essentially, the 1Cycle learning rate schedule looks something like this:

https://preview.redd.it/sc37u5knmxa61.png?width=476&format=png&auto=webp&s=09b309b4dbd67eedb4ab5f86e03e0e83d7b072d1

Sylvain writes: [1cycle consists of] two steps of equal lengths, one going from a lower learning rate to a higher one than go back to the minimum. The maximum should be the value picked with the Learning Rate Finder, and the lower one can be ten times lower. Then, the length of this cycle should be slightly less than the total number of epochs, and, in the last part of training, we should allow the learning rate to decrease more than the minimum, by several orders of magnitude.
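To make the shape Sylvain describes concrete, here is a minimal pure-Python sketch of a 1cycle-style schedule. The helper name `one_cycle_lr`, the linear interpolation, the 10% tail, and the default divisors are all assumptions chosen for illustration; PyTorch's built-in scheduler uses its own defaults, so this is a picture of the idea rather than a reimplementation of it.

```python
def one_cycle_lr(step, total_steps, max_lr, div=10.0, final_div=1000.0, tail_frac=0.1):
    """Learning rate at `step` (0-based) for a 1cycle-style schedule.

    Hypothetical helper: two linear phases of equal length (warm-up and
    cool-down) followed by a short tail that decays far below the minimum.
    Assumes total_steps is large enough that each phase has at least one step.
    """
    min_lr = max_lr / div              # "the lower one can be ten times lower"
    final_lr = min_lr / final_div      # several orders of magnitude below min_lr
    tail_steps = int(total_steps * tail_frac)
    half = (total_steps - tail_steps) // 2   # length of each of the two equal phases

    if step < half:                    # phase 1: min_lr -> max_lr
        t = step / half
        return min_lr + t * (max_lr - min_lr)
    if step < 2 * half:                # phase 2: max_lr -> min_lr
        t = (step - half) / half
        return max_lr - t * (max_lr - min_lr)
    # tail: min_lr -> final_lr over the remaining steps
    remaining = total_steps - 2 * half
    t = (step - 2 * half) / max(remaining - 1, 1)
    return min_lr + t * (final_lr - min_lr)


schedule = [one_cycle_lr(s, total_steps=100, max_lr=0.1) for s in range(100)]
```

With these assumed settings the rate starts at a tenth of the maximum, peaks a little before the halfway point, and ends several orders of magnitude below where it started.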
In the best case this schedule achieves a massive speed-up – what Smith calls Superconvergence – as compared to conventional learning rate schedules. Using the 1Cycle policy he needs ~10x fewer training iterations of a ResNet-56 on ImageNet to match the performance of the original paper, for instance. The schedule seems to perform robustly well across common architectures and optimizers. PyTorch implements both of these methods as torch.optim.lr_scheduler.CyclicLR and torch.optim.lr_scheduler.OneCycleLR, see the documentation. One drawback of these schedulers is that they introduce a number of additional hyperparameters. This post and this repo offer a nice overview and implementation of how good hyperparameters can be found, including the Learning Rate Finder mentioned above. Why does this work? It doesn't seem entirely clear, but one possible explanation might be that regularly increasing the learning rate helps to traverse saddle points in the loss landscape more quickly.

Use multiple workers and pinned memory in DataLoader

When using torch.utils.data.DataLoader, set num_workers > 0, rather than the default value of 0, and pin_memory=True, rather than the default value of False. Details of this are explained here. Szymon Migacz achieves a 2x speed-up for a single training epoch by using four workers and pinned memory. A rule of thumb that people are using to choose the number of workers is to set it to four times the number of available GPUs, with both a larger and a smaller number of workers leading to a slowdown. Note that increasing num_workers will increase your CPU memory consumption.

Max out the batch size

This is a somewhat contentious point. Generally, however, it seems like using the largest batch size your GPU memory permits will accelerate your training (see NVIDIA's Szymon Migacz, for instance). Note that you will also have to adjust other hyperparameters, such as the learning rate, if you modify the batch size.
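Returning briefly to the DataLoader settings from the previous section, a minimal sketch might look like the following. The toy dataset, the batch size, and the choice to enable pinning only when a GPU is present are assumptions for illustration; the "four workers per GPU" figure is just the rule of thumb quoted above, not a measured optimum.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy in-memory dataset standing in for a real one (assumption for illustration).
dataset = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,)))

use_cuda = torch.cuda.is_available()
num_gpus = max(torch.cuda.device_count(), 1)

loader = DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    num_workers=4 * num_gpus,   # rule of thumb from the text: 4 workers per GPU
    pin_memory=use_cuda,        # pinned host memory only helps when copying to a GPU
)
```

Worker processes are only started once you iterate over `loader`. On platforms that spawn rather than fork worker processes, the iterating code generally needs to sit under an `if __name__ == "__main__":` guard.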
A rule of thumb here is to double the learning rate as you double the batch size. OpenAI has a nice empirical paper on the number of convergence steps needed for different batch sizes. Daniel Huynh runs some experiments with different batch sizes (also using the 1Cycle policy discussed above) where he achieves a 4x speed-up by going from batch size 64 to 512. One of the downsides of using large batch sizes, however, is that they might lead to solutions that generalize worse than those trained with smaller batches.

Use Automatic Mixed Precision (AMP)

The release of PyTorch 1.6 included a native implementation of Automatic Mixed Precision training. The main idea here is that certain operations can be run faster and without a loss of accuracy at half-precision (FP16) rather than in the single-precision (FP32) used elsewhere. AMP, then, automatically decides which operation should be executed in which format. This allows both for faster training and a smaller memory footprint. In the best case, the usage of AMP would look something like this:

    import torch

    # Creates once at the beginning of training
    scaler = torch.cuda.amp.GradScaler()

    for data, label in data_iter:
        optimizer.zero_grad()
        # Casts operations to mixed precision
        with torch.cuda.amp.autocast():
            loss = model(data)
        # Scales the loss, and calls backward() to create scaled gradients
        scaler.scale(loss).backward()
        # Unscales gradients and calls or skips optimizer.step()
        scaler.step(optimizer)
        # Updates the scale for next iteration
        scaler.update()

Benchmarking a number of common language and vision models on NVIDIA V100 GPUs, Huang and colleagues find that using AMP over regular FP32 training yields roughly 2x – but up to 5.5x – training speed-ups. Currently, only CUDA ops can be autocast in this way. See the documentation here for more details on this and other limitations.
u/SVPERBlA points out that you can squeeze out some additional performance (~20%) from AMP on NVIDIA Tensor Core GPUs if you convert your tensors to the Channels Last memory format. Refer to this section in the NVIDIA docs for an explanation of the speed-up and more about NCHW versus NHWC tensor formats.

Consider using another optimizer

AdamW is Adam with weight decay (rather than L2-regularization), which was popularized by fast.ai and is now available natively in PyTorch as torch.optim.AdamW. AdamW seems to consistently outperform Adam in terms of both the error achieved and the training time. See this excellent blog post on why using weight decay instead of L2-regularization makes a difference for Adam. Both Adam and AdamW work well with the 1Cycle policy described above. There are also a few not-yet-native optimizers that have received a lot of attention recently, most notably LARS (pip installable implementation) and LAMB. NVIDIA's APEX implements fused versions of a number of common optimizers such as Adam. This implementation avoids a number of passes to and from GPU memory as compared to the PyTorch implementation of Adam, yielding speed-ups in the range of 5%.

Turn on cuDNN benchmarking

If your model architecture remains fixed and your input size stays constant, setting torch.backends.cudnn.benchmark = True might be beneficial (docs). This enables the cuDNN autotuner, which will benchmark a number of different ways of computing convolutions in cuDNN and then use the fastest method from then on. For a rough reference on the type of speed-up you can expect from this, Szymon Migacz achieves a speed-up of 70% on a forward pass for a convolution and a 27% speed-up for a forward + backward pass of the same convolution. One caveat here is that this autotuning might become very slow if you max out the batch size as mentioned above.
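The three settings from the last few sections (Channels Last, AdamW, and cuDNN benchmarking) can be sketched together as below. The toy model, tensor shapes, and hyperparameter values are assumptions for illustration only, and the snippet falls back to the CPU when no GPU is present, in which case it shows the API calls but none of the speed-ups.

```python
import torch
import torch.nn as nn

# Let cuDNN benchmark convolution algorithms; only useful when input shapes are fixed.
# Setting the flag is harmless on CPU-only machines.
torch.backends.cudnn.benchmark = True

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy convolutional model (assumption for illustration).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 8, kernel_size=3, padding=1),
).to(device)

# Channels Last: same logical NCHW shape, NHWC layout in memory.
model = model.to(memory_format=torch.channels_last)
x = torch.randn(8, 3, 32, 32, device=device).to(memory_format=torch.channels_last)

# AdamW applies decoupled weight decay instead of L2 regularization.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

out = model(x)
loss = out.mean()
loss.backward()
optimizer.step()
```

Note that the tensor still reports the shape (8, 3, 32, 32) after the conversion; only the underlying strides change.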
Beware of frequently transferring data between CPUs and GPUs

Beware of frequently transferring tensors from a GPU to a CPU using tensor.cpu() and vice versa using tensor.cuda() as these are relatively expensive. The same applies to .item() and .numpy(); use .detach() instead. If you are creating a new tensor, you can also directly assign it to your GPU using the keyword argument device=torch.device('cuda:0'). If you do need to transfer data, using .to(non_blocking=True) might be useful as long as you don't have any synchronization points after the transfer. If you really have to, you might want to give Santosh Gupta's SpeedTorch a try, although it doesn't seem entirely clear when this actually does/doesn't provide speed-ups.

Use gradient/activation checkpointing

Quoting directly from the documentation:

"Checkpointing works by trading compute for memory. Rather than storing all intermediate activations of the entire computation graph for computing backward, the checkpointed part does not save intermediate activations, and instead recomputes them in backward pass. It can be applied on any part of a model. Specifically, in the forward pass, function will run in torch.no_grad() manner, i.e., not storing the intermediate activations. Instead, the forward pass saves the inputs tuple and the function parameter. In the backwards pass, the saved inputs and function is retrieved, and the forward pass is computed on function again, now tracking the intermediate activations, and then the gradients are calculated using these activation values."

So while this might slightly increase your run time for a given batch size, you'll significantly reduce your memory footprint. This in turn will allow you to further increase the batch size you're using, allowing for better GPU utilization. While checkpointing is implemented natively as torch.utils.checkpoint (docs), it does seem to take some thought and effort to implement properly.
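To make the trade-off concrete, here is a small sketch that runs the same block once normally and once through torch.utils.checkpoint.checkpoint and compares the resulting gradients. The layer sizes and the `block`/`head` split are hypothetical, and `use_reentrant=False` assumes a reasonably recent PyTorch version that accepts that argument.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

# Hypothetical block whose intermediate activations we choose not to store.
block = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 32))
head = nn.Linear(32, 1)
x = torch.randn(8, 32)

# Baseline: activations inside `block` are kept for the backward pass.
head(block(x)).sum().backward()
baseline_grad = block[0].weight.grad.clone()

block.zero_grad()
head.zero_grad()

# Checkpointed: `block` is re-run during backward instead of storing activations.
hidden = checkpoint(block, x, use_reentrant=False)
head(hidden).sum().backward()
checkpointed_grad = block[0].weight.grad.clone()
```

The gradients should match; what changes is that the checkpointed run pays for a second forward pass through `block` in exchange for not holding its activations in memory.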
Priya Goyal has a good tutorial demonstrating some of the key aspects of checkpointing.

Use gradient accumulation

Another approach to increasing the batch size is to accumulate gradients across multiple .backward() passes before calling optimizer.step(). Following a post by Hugging Face's Thomas Wolf, gradient accumulation can be implemented as follows:

    model.zero_grad()                                   # Reset gradients tensors
    for i, (inputs, labels) in enumerate(training_set):
        predictions = model(inputs)                     # Forward pass
        loss = loss_function(predictions, labels)       # Compute loss function
        loss = loss / accumulation_steps                # Normalize our loss (if averaged)
        loss.backward()                                 # Backward pass
        if (i+1) % accumulation_steps == 0:             # Wait for several backward steps
            optimizer.step()                            # Now we can do an optimizer step
            model.zero_grad()                           # Reset gradients tensors
            if (i+1) % evaluation_steps == 0:           # Evaluate the model when we...
                evaluate_model()                        # ...have no gradients accumulated

This method was developed mainly to circumvent GPU memory limitations and I'm not entirely clear on the trade-off involved in running additional .backward() loops. This discussion on the fastai forum seems to suggest that it can in fact accelerate training, so it's probably worth a try.

Use Distributed Data Parallel for multi-GPU training

Methods to accelerate distributed training probably warrant their own post but one simple one is to use torch.nn.parallel.DistributedDataParallel rather than torch.nn.DataParallel. By doing so, each GPU will be driven by a dedicated CPU core, avoiding the GIL issues of DataParallel. In general, I can strongly recommend reading the documentation on distributed training.

Set gradients to None rather than 0

Use .zero_grad(set_to_none=True) rather than .zero_grad(). Doing so will let the memory allocator handle the gradients rather than actively setting them to 0. This will yield a modest speed-up, as they say in the documentation, so don't expect any miracles. Watch out, doing this is not side-effect free!
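The main side effect is easy to see in a few lines: after zeroing, .grad is still a tensor, while after set_to_none=True it is None, so any code that reads gradients has to cope with that. This is a minimal sketch with a hypothetical toy model and a hypothetical `grad_norm` helper; as far as I know, recent PyTorch versions already use set_to_none=True as the default, which is why both variants are spelled out explicitly here.

```python
import torch
from torch import nn

model = nn.Linear(4, 2)  # hypothetical stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

model(torch.randn(3, 4)).sum().backward()
optimizer.zero_grad(set_to_none=False)
# Copy the zeroed gradient so later backward passes cannot modify it.
grad_after_zeroing = model.weight.grad.clone()

model(torch.randn(3, 4)).sum().backward()
optimizer.zero_grad(set_to_none=True)
grad_after_none = model.weight.grad  # now None, not a tensor of zeros


def grad_norm(parameters):
    """Total gradient L2 norm that tolerates parameters without a gradient."""
    grads = [p.grad for p in parameters if p.grad is not None]
    if not grads:
        return 0.0
    return torch.stack([g.norm() for g in grads]).norm().item()
```

Code such as logging or manual gradient manipulation that assumes `p.grad` is always a tensor is the typical place where this bites.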
Check the docs for the details on this.

Use .as_tensor() rather than .tensor()

torch.tensor() always copies data. If you have a numpy array that you want to convert, use torch.as_tensor() or torch.from_numpy() to avoid copying the data.

Turn on debugging tools only when actually needed

PyTorch offers a number of useful debugging tools like the autograd.profiler, autograd.gradcheck, and autograd.detect_anomaly. Make sure to use them when you need a better understanding of what is going on, but also turn them off when you don't need them as they will slow down your training.

Use gradient clipping

Originally used to avoid exploding gradients in RNNs, there is both some empirical evidence as well as some theoretical support that clipping gradients (roughly speaking: gradient = min(gradient, threshold)) accelerates convergence. Hugging Face's Transformer implementation is a really clean example of how to use gradient clipping as well as some of the other methods such as AMP mentioned in this post. In PyTorch this can be done using torch.nn.utils.clip_grad_norm_ (documentation). It's not entirely clear to me which models benefit how much from gradient clipping but it seems to be robustly useful for RNNs, Transformer-based and ResNet architectures and a range of different optimizers.

Turn off bias before BatchNorm

This is a very simple one: turn off the bias of layers before BatchNormalization layers. For a 2-D convolutional layer, this can be done by setting the bias keyword to False: torch.nn.Conv2d(..., bias=False, ...). (Here's a reminder why this makes sense.) You will save some parameters; I would however expect the speed-up of this to be relatively small as compared to some of the other methods mentioned here.

Turn off gradient computation during validation

This one is straightforward: use torch.no_grad() during validation.

Use input and batch normalization

You're probably already doing this but you might want to double-check: Are you normalizing your input?
Are you using batch-normalization? And here's a reminder of why you probably should.

Bonus tip from the comments: Use JIT to fuse point-wise operations.

If you have adjacent point-wise operations you can use PyTorch JIT to combine them into one FusionGroup which can then be launched on a single kernel rather than on multiple kernels as would have been done by default. You'll also save some memory reads and writes. Szymon Migacz shows how you can use the @torch.jit.script decorator to fuse the operations in a GELU, for instance:

    @torch.jit.script
    def fused_gelu(x):
        return x * 0.5 * (1.0 + torch.erf(x / 1.41421))

In this case, fusing the operations leads to a 5x speed-up for the execution of fused_gelu as compared to the unfused version. See also this post for an example of how TorchScript can be used to accelerate an RNN. Hat tip to u/Patient_Atmosphere45 for the suggestion.

Sources and additional resources

Many of the tips listed above come from Szymon Migacz' talk and post in the PyTorch docs.

PyTorch Lightning's William Falcon has two interesting posts with tips to speed up training. PyTorch Lightning already takes care of some of the points above by default.

Thomas Wolf at Hugging Face has a number of interesting articles on accelerating deep learning – with a particular focus on language models.

The same goes for Sylvain Gugger and Jeremy Howard: they have many interesting posts, in particular on learning rates and AdamW.

Thanks to Ben Hahn, Kevin Klein and Robin Vaaler for their feedback on a draft of this post! I've also put all of the above into this blog post.
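As a quick sanity check on the fused GELU from the JIT tip above, the sketch below compares it against PyTorch's built-in GELU. It assumes the expression in that tip reads x * 0.5 * (1.0 + torch.erf(x / 1.41421)), where 1.41421 approximates the square root of 2. On a CPU this only checks numerical agreement; it says nothing about the speed-up, which depends on the kernels actually being fused on a GPU.

```python
import torch
import torch.nn.functional as F


@torch.jit.script
def fused_gelu(x):
    # Exact (erf-based) GELU with sqrt(2) rounded to 1.41421.
    return x * 0.5 * (1.0 + torch.erf(x / 1.41421))


# Compare against the reference implementation on a small grid of inputs.
x = torch.linspace(-3.0, 3.0, steps=13)
max_abs_diff = (fused_gelu(x) - F.gelu(x)).abs().max().item()
```

The remaining difference comes from the rounded constant and float32 arithmetic, so it should be tiny.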

5-Day Applied Rationality Workshop for Machine Learning Students & Researchers
reddit
LLM Vibe Score0
Human Vibe Score1
AnnaSalamonThis week

The Center for Applied Rationality is a Berkeley-based nonprofit that runs immersive workshops for entrepreneurs, researchers, students, and other ambitious, analytical, practically-minded people. The practice of “applied rationality”, which the workshops aim towards, involves noticing what cognitive algorithms you seem to be running, checking whether those algorithms seem to be helping you form accurate beliefs and achieve your goals, and looking for ways to improve them.

A typical 4-day CFAR workshop costs $3900 to attend, but thanks to a generous grant from the Future of Life Institute this fall we will be running a free five-day workshop for students and researchers in the fields of machine learning and artificial intelligence. All costs are covered by this grant, including room, board, and flights.

The workshop will take place this Aug 30 through Sep 4 in the San Francisco Bay Area and will include:

2 days focused on learning models and skills, such as how habits develop and how to redesign your habits.
2 days focused on practicing skills and applying them to whichever areas of your life you would like to make improvements on, such as how to make faster progress on projects or how to have more productive collaborations with colleagues.
1 day (special to this workshop) focused on discussion of the long-term impact of artificial intelligence, and on what reasoning habits, if spread across the relevant research communities, may increase the probability of positive long-term AI outcomes.

Go here to read more or to apply, or ask questions here.

[Discussion] When ML and Data Science are the death of a good company: A cautionary tale.
reddit
LLM Vibe Score0
Human Vibe Score0.6
AlexSnakeKingThis week

TL;DR: At Company A, Team X does advanced analytics using on-prem ERP tools and older programming languages. Their tools work very well and are designed based on very deep business and domain expertise. Team Y is a new and ambitious Data Science team that thinks they can replace Team X's tools with a bunch of R scripts and a custom built ML platform. Their models are simplistic, but more "fashionable" compared to the econometric models used by Team X, and team Y benefits from the ML/DS moniker so leadership is allowing Team Y to start a large scale overhaul of the analytics platform in question. Team Y doesn't have the experience for such a large scale transformation, and is refusing to collaborate with team X. This project is very likely going to fail, and cause serious harm to the company as a whole, financially and from a people perspective. I argue that this is not just because of bad leadership, but also because of various trends and mindsets in the DS community at large.

Update (Jump to below the line for the original story): Several people in the comments are pointing out that this is just a management failure, not something due to ML/DS, and that you can replace DS with any buzz tech and the story will still be relevant. My response: Of course, any failure at an organization level is ultimately a management failure one way or the other. Moreover, it is also the case that ML/DS, when done correctly, will always improve a company's bottom line. There is no scenario where the proper ML solution, delivered at a reasonable cost and in a timely fashion, will somehow hurt the company's bottom line.
My point is that in this case management is failing because of certain trends and practices that are specific to the ML/DS community, namely:

The idea that DS teams should operate independently of tech and business orgs -- too much autonomy for DS teams.

The disregard for domain knowledge that seems prevalent nowadays thanks to the ML hype: the belief that data scientists can be generalists and that someone with good enough ML chops can solve any business problem. That wasn't the case when I first left academia for the industry in 2009 (back then nobody would even bother with a phone screen if you didn't have the right domain knowledge).

Over-reliance on resources who check all the ML hype related boxes (knows Python, R, Tensorflow, Shiny, etc., has the right Coursera certifications, has blogged on the topic, etc.), but are lacking in depth of experience. DS interviews nowadays all seem to be: Can you tell me what a p-value is? What is elastic net regression? Show me how to fit a model in sklearn? How do you impute NAs in an R dataframe? Any smart person can look those up on Stack Overflow or Cross Validated. Instead, teams should be asking stuff like: Why does portfolio optimization use QP not LP? How does a forecast influence a customer service level? When should a recommendation engine be content based and when should it use collaborative filtering? etc.

(This is a true story, happening to the company I currently work for. Names, domains, algorithms, and roles have been shuffled around to protect my anonymity.)

Company A has been around for several decades. It is not the biggest name in its domain, but it is a well respected one. Risk analysis and portfolio optimization have been a core of Company A's business since the 90s. They have a large team of 30 or so analysts who perform those tasks on a daily basis. These analysts use ERP solutions implemented for them by one of the big ERP companies (SAP, Teradata, Oracle, JD Edwards,...)
or one of the major tech consulting companies (Deloitte, Accenture, PWC, Capgemini, etc...) in collaboration with their own in house engineering team. The tools used are embarrassingly old school: Classic RDBMS running on on-prem servers or maybe even on mainframes, code written in COBOL, Fortran, weird proprietary stuff like ABAP or SPSS.....you get the picture. But the models and analytic functions were pretty sophisticated, and surprisingly cutting edge compared to the published academic literature. Most of all, they fit well with the company's enterprise ecosystem, and were honed based on years of deep domain knowledge.  They have a tech team of several engineers (poached from the aforementioned software and consulting companies) and product managers (who came from the experienced pools of analysts and managers who use the software, or poached from business rivals) maintaining and running this software. Their technology might be old school, but collectively, they know the domain and the company's overall architecture very, very well. They've guided the company through several large scale upgrades and migrations and they have a track record of delivering on time, without too much overhead. The few times they've stumbled, they knew how to pick themselves up very quickly. In fact within their industry niche, they have a reputation for their expertise, and have very good relations with the various vendors they've had to deal with. They were the launching pad of several successful ERP consulting careers.  Interestingly, despite dealing on a daily basis with statistical modeling and optimization algorithms, none of the analysts, engineers, or product managers involved describe themselves as data scientists or machine learning experts. It is mostly a cultural thing: Their expertise predates the Data Science/ML hype that started circa 2010, and they got most of their chops using proprietary enterprise tools instead of the open source tools popular nowadays. 
A few of them have formal statistical training, but most of them came from engineering or domain backgrounds and learned stats on the fly while doing their job. Call this team "Team X".

Sometime around the mid 2010s, Company A started having some serious anxiety issues: Although still doing very well for a company its size, overall economic and demographic trends were shrinking its customer base, and a couple of so called disruptors came up with a new app and business model that started seriously eating into their revenue. A suitable reaction to appease shareholders and Wall Street was necessary. The company already had a decent website and a pretty snazzy app, what more could be done? Leadership decided that it was high time that AI and ML become a core part of the company's business. An ambitious manager, with no science or engineering background, but who had very briefly toyed with a recommender system a couple of years back, was chosen to build a data science team, call it team "Y" (he had a bachelor's in history from the local state college and worked for several years in the company's marketing org). Team "Y" consists mostly of internal hires who decided they wanted to be data scientists and completed a Coursera certification or a Galvanize boot camp before being brought on to the team, along with a few fresh Ph.D. or M.Sc. holders who didn't like academia and wanted to try their hand at an industry role. All of them were very bright people, they could write great Medium blog posts and give inspiring TED talks, but collectively they had very little real world industry experience.

As is the fashion nowadays, this group was made part of a data science org that reported directly to the CEO and Board, bypassing the CIO and any tech or business VPs, since Company A wanted to claim the monikers "data driven" and "AI powered" in their upcoming shareholder meetings. In 3 or 4 years of existence, team Y produced a few Python and R scripts.
Their architectural experience consisted almost entirely of connecting Flask to S3 buckets or Redshift tables, with a couple of the more resourceful ones learning how to plug their models into Tableau or how to spin up a Kubernetes pod.

But they needn't worry: The aforementioned manager, who was now a director (and was also doing an online Masters to make up for his qualifications gap and bolster his chances of becoming VP soon - at least he now understands what L1 regularization is), was a master at playing corporate politics and self-promotion. No matter how few actionable insights team Y produced or how little code they deployed to production, he always had their back and made sure they had ample funding. In fact he now had grandiose plans for setting up an all-purpose machine learning platform that can be used to solve all of the company's data problems.

A couple of sharp minded members of team Y, upon googling their industry name along with the words "data science", realized that risk analysis was a prime candidate for being solved with Bayesian models, and there was already a nifty R package for doing just that, whose tutorial they went through on R-Bloggers.com. One of them had even submitted a Bayesian classifier Kernel for a competition on Kaggle (he was 203rd on the leaderboard), and was eager to put his new-found expertise to use on a real world problem. They pitched the idea to their director, who saw a perfect use case for his upcoming ML platform. They started work on it immediately, without bothering to check whether anybody at Company A was already doing risk analysis. Since their org was independent, they didn't really need to check with anybody else before they got funding for their initiative. Although it was basically a Naive Bayes classifier, the term ML was added to the project title, to impress the board.

As they progressed with their work however, tensions started to build.
They had asked the data warehousing and CA analytics teams to build pipelines for them, and word eventually got out to team X about their project. Team X was initially thrilled: They offered to collaborate wholeheartedly, and would have loved to add an ML based feather to their already impressive cap. The product owners and analysts were totally onboard as well: They saw a chance to get in on the whole Data Science hype that they kept hearing about. But through some weird mix of arrogance and insecurity, team Y refused to collaborate with them or share any of their long term goals with them, even as they went to other parts of the company giving brown bag presentations and tutorials on the new model they created.

Team X got resentful: from what they saw of team Y's model, their approach was hopelessly naive and had little chance of scaling or being sustainable in production, and they knew exactly how to help with that. Deploying the model to production would have taken them a few days, given how comfortable they were with DevOps and continuous delivery (team Y had taken several months to figure out how to deploy a simple R script to production). And despite how old school their own tech was, team X were crafty enough to be able to plug it in to their existing architecture. Moreover, the output of the model didn't take into account how the business would consume it or how it was going to be fed to downstream systems, and the product owners could have gone a long way in making the model more amenable to adoption by the business stakeholders. But team Y wouldn't listen, and their leads brushed off any attempts at communication, let alone collaboration. The vibe that team Y was giving off was "We are the cutting edge ML team, you guys are the legacy server grunts.
We don't need your opinion.", and they seemed to have a complete disregard for domain knowledge, or worse, they thought that all that domain knowledge consisted of was being able to grasp the definitions of a few business metrics.

Team X got frustrated and tried to express their concerns to leadership. But despite owning a vital link in Company A's business process, they were only ~50 people in a large 1000 strong technology and operations org, and they were several layers removed from the C-suite, so it was impossible for them to get their voices heard.

Meanwhile, the unstoppable director was doing what he did best: Playing corporate politics. Despite how little his team had actually delivered, he had convinced the board that all analysis and optimization tasks should now be migrated to his yet to be delivered ML platform. Since most leaders now knew that there was overlap between team Y and team X's objectives, his pitch was no longer that team Y was going to create a new insight, but that they were going to replace (or modernize) the legacy statistics based on-prem tools with more accurate cloud based ML tools. Never mind that there was no support in the academic literature for the idea that Naive Bayes works better than the econometric approaches used by team X, let alone the additional wacky idea that Bayesian Optimization would definitely outperform the QP solvers that were running in production.

Unbeknownst to team X, the original Bayesian risk analysis project has now grown into a multimillion dollar major overhaul initiative, which includes the eventual replacement of all of the tools and functions supported by team X along with the necessary migration to the cloud. The CIO and a couple of business VPs are now on board, and tech leadership is treating it as a done deal. An outside vendor, a startup that nobody had heard of, was contracted to help build the platform, since team Y has no engineering skills.
The choice was deliberate, as calling on any of the established consulting or software companies would have eventually led leadership to the conclusion that team X was better suited for a transformation on this scale than team Y.

Team Y has no experience with any major ERP deployments, and no domain knowledge, yet they are being tasked with fundamentally changing the business process that is at the core of Company A's business. Their models actually perform worse than those deployed by team X, and their architecture is hopelessly simplistic compared to what is necessary for running such a solution in production.

Ironically, using Bayesian thinking and based on all the evidence, the likelihood that team Y succeeds is close to 0%. At best, the project is going to end up being a write off of 50 million dollars or more. Once the !@#$!@ hits the fan, a couple of executive heads are going to roll, and dozens of people will get laid off. At worst, given how vital risk analysis and portfolio optimization are to Company A's revenue stream, the failure will eventually sink the whole company. It probably won't go bankrupt, but it will lose a significant portion of its business and work force. Failed ERP implementations can and do sink large companies: Just see what happened to National Grid US, SuperValu or Target Canada.

One might argue that this is more about corporate dysfunction and bad leadership than about data science and AI. But I disagree. I think the core driver of this debacle is indeed the blind faith in Data Scientists, ML models and the promise of AI, and the overall culture of hype and self promotion that is very common among the ML crowd.

We haven't seen the end of this story: I sincerely hope that this ends well for the sake of my colleagues and all involved. Company A is a good company, and both its customers and its employees deserve better.
But the chances of that happening are negligible given all the information available, and this failure will hit my company hard.

I tested hundreds of marketing tools in the last three years and these 50 made it to the list. I'll sum up my top 50 marketing tools with one or two sentences + give you pricings.
reddit
LLM Vibe Score0
Human Vibe Score1
SpicyCopyThis week

Hey guys,

I'm working in a growth marketing agency. Marketing tools are 30% of what we do, so we use them a lot and experiment with the new ones as much as possible. There are thousands of tools and it's easy to get lost, so I wanted to share the tools we use most on a daily basis, divided into 14 categories. I thought this could be handy for the Entrepreneurs subreddit.

Why adopt tools?

I see marketing tools as tireless colleagues. If you can't hire an employee, choosing the right tool can solve your problems, because they:
Are super cheap.
Work 24/7 for you.
Don’t make mistakes.
Don’t need management (or need less management).
Help you to automate the majority of your lead gen process.

Onwards to the list. (With the pricing included, the post ended up quite long; you can find a link at the end if you want to check the prices.)

Email marketing tools
#1 ActiveCampaign is armed with the most complicated email automation features and has the most intuitive user experience. It feels like you already know how to use it.
#2 Autopilot is a visual marketing automation and customer journey tool that helps you acquire and nurture leads based on behaviors, interests, etc.
#3 Mailjet: This is the tool we use to send out bulk email campaigns such as newsletters. It doesn't have sexy features like the others but does its job for a cheap price.

Email address finders
#4 Skrapp finds the emails of your contacts by name and company. It also works with LinkedIn Sales Navigator, can extract thousands of emails in bulk, and has a browser add-on.
#5 Hunter: Similar to Skrapp but doesn't work with LinkedIn Sales Navigator directly. In addition, there are email templates and you can set up email campaigns.

Prospecting and outreach tools
#6 Prospect combines personal emails, follow-up calls, and other social touches and helps you create multichannel campaigns.
#7 Reply is a more intuitive version of Prospect. It is easy to learn and use; their UX makes you feel good and sufficient.
CRM tools
#8 Salesflare helps you to stop managing your data and start managing your customers. Not yet as popular as Hubspot and the like, but the best solution for smaller B2B businesses. (We're fans.)
#9 Hubspot: The most popular CRM for good reason, with a broader product range you can adopt in your next steps. Try this if you have a bulky list of customers because it is free.
#10 Pardot: Pardot is by Salesforce; it's armed with features that can close the gap between marketing and sales.

Sales tools
#11 Salesforce is the best sales automation and lead management software. It helps you to create complicated segmentations and run, track, and analyze campaigns from the same dashboard.
#12 LinkedIn Sales Navigator gives you full access to LinkedIn's user database. You can even find a kidnapped CEO if you know how to use it with other marketing automation tools like Skrapp.
#13 Pipedrive is a simple tool and excels in one thing. It tracks your leads and tells you when to take the next action. It makes sales easier.
#14 Qwilr creates great-looking docs, at speed. You can design perfect proposals, quotes, client updates, and more in a flash. We use it a lot to close deals; it's effective.
#15 Crystalknows is an add-on that tells you anyone’s personality on LinkedIn and gives you a detailed approach specific to that person. It's eerily accurate.
#16 Leadfeeder shows you the companies that visited your website. It tells you how they found you and what they’re interested in. It has a free version.

Communication tools
#17 Intercom is a sweet and smart host that welcomes your visitors when you’re not home. It’s one of the best chatbot tools in the market.
#18 Drift is famous for its conversational marketing features and is more sales-focused than Intercom.
#19 Manychat is a chatbot that helps you create high converting Facebook campaigns.
#20 Plann3r helps you create your personalized meeting page. You can schedule meetings with clients, candidates, and prospects.
#21 Loom is a video messaging tool; it helps you to be more expressive and create closer relationships.
#22 Callpage collects your visitors’ phone numbers and connects you with them in seconds, no matter where you are.

Landing page tools
#23 Instapage is the best overall landing page builder. It has a broad range of features and even a squirrel can build a compelling landing page with the templates. No coding needed.
#24 Unbounce can do everything that Instapage does and lets you build a great landing page without a developer. But it's less intuitive.

Lead generation / marketing automation tools
#25 Phantombuster is by far the most used lead generation software in our tool kit. It extracts data and emails, sends requests and customized messages, and does many things on autopilot on any platform. You can check this, this and this if you want to see it in action.
#26 Duxsoup is a Google Chrome add-on and can also automate some LinkedIn lead generation efforts like Phantombuster. But it doesn't work in the cloud.
#27 Zapier is the glue that holds all the lead generation tools together. With Zapier, you can connect different marketing tools with no coding required.

Conversion rate optimization tools
#28 Hotjar tracks what people are doing on your website by recording sessions and capturing mouse movements. Then it gives you a heatmap.
#29 UsabilityHub shows your page to a digital crowd, measures the first impressions, and helps you to validate your ideas.
#30 Optinmonster is a top tier conversion optimization tool. It helps you to capture leads and enables you to increase conversion rates with many features.
#31 Notifia is one mega tool of widgets that arms your website with the wildest social proof and lead capturing tactics.
#32 Sumo is a much simpler version of Notifia. But Sumo has everything to help you capture leads and build your email lists.
Web scrapers
#33 Data Miner is a Google Chrome browser extension that helps you scrape data from web pages into a CSV file or Excel spreadsheet.
#34 Webscraper does the same thing as Data Miner; however, it is capable of handling more complex tasks.

SEO and Content
#35 Grammarly: English could be your first language and your grammar could be better than Shakespeare's. Grammarly can still make your writing better.
#36 Hemingwayapp is a copywriting optimization tool that gives you feedback about your copy, improves your readability score, and makes your writing bolder and punchier. Free.
#37 Ahrefs is an all-rounder search engine optimization tool that helps you with off-page, on-page or technical SEO.
#38 SurferSEO makes things easier for your on-page SEO efforts. It’s a tool that analyzes top Google results for specific keywords and gives you a content brief based on that data.

Video editing and design tools
#39 Canva is a graphic design platform that makes everything easy. It has thousands of templates for anything from Facebook ads and stylish presentations to business cards.
#40 Kapwing is our go-to platform for quick video edits. It works in the browser and can help you to create stylish videos, add subtitles, resize videos, create memes, or remove backgrounds.
#41 Animoto can turn your photos and video clips into beautiful video slideshows. It comes in handy when you want to create advertising material but don’t have a budget.

Advertising tools
#42 AdEspresso lets you create and test multiple ads with a few clicks. You can optimize your FB, IG, and Google ads from this tool and measure your ads with in-depth analytics.
#43 AdRoll is an AI-driven platform that connects and coordinates marketing efforts across ads, email, and online stores.

Other tools
#44 Replug helps you to shorten, track, and optimize your links with call-to-actions, branded links, and retargeting pixels.
#45 Draw.io = Mindmaps, schemes, and charts.
With Draw.io, you can put your brain on digital paper in an organized way. #46 Built With is a tool that finds out what websites are built with, so you can see what tools they're using and so on. #47 Typeform turns data collection into an experience. This tool helps you engage your audience with conversational forms or surveys and helps you collect more data. #48 Livestorm helped us a lot, especially in COVID-19 times. It’s a webinar software that works on your browser, mobile, and desktop. #49 Teachable: If you have an online course idea but are hesitating because of the production process, Teachable can help you. It's easy to configure and customizable for your needs. #50 Viral Loops provides a revolutionary referral marketing solution for modern marketers. You can create and run referral campaigns in a few clicks with templates. Remember, most of these tools have a free trial or free version. Going over them one by one can teach you a lot and help you grow your business with less manpower in the early stages of your business. I hope you enjoyed the read and can find some tools to make things easier! Let me know about your favorite tools in the comments, so I can try them out. ------ If you want to check the prices and see a broader explanation about the tools, you can go here.

The delicate balance of building an online community business
reddit
LLM Vibe Score: 0
Human Vibe Score: 0.895
matthewbarby · This week

Hey /r/Entrepreneur 👋 Just under two years ago I launched an online community business called Traffic Think Tank with two other co-founders, Nick Eubanks and Ian Howells. As a Traffic Think Tank customer you (currently) pay $119 a month to get access to our online community, which is run through Slack. The community is focused on helping you learn various aspects of marketing, with a particular focus on search engine optimization (SEO). Alongside access to the Slack community, we publish new educational video content from outside experts every week that all customers have access to. At the time of writing, Traffic Think Tank has around 650 members spanning 17 of the 24 global time zones. I was on a business trip over in Sydney recently, and during my time there I met up with some of our Australia-based community members. During dinner I was asked by several of them how the idea for Traffic Think Tank came about and what steps we took to validate that the idea was worth pursuing. This is what I told them… How it all began It all started with a personal need. Nick, an already successful entrepreneur and owner of a marketing agency, had tested out an early version of Traffic Think Tank in early 2017. He offered real-time consulting for around ten customers that he ran from Slack. He would publish some educational videos and offer his advice on projects that the members were running. The initial test went well, but it was tough to maintain on his own and he had to charge a fairly high price to make it worth his time. That’s when he spoke to me and Ian about turning this idea into something much bigger. Both Ian and I offered something slightly different to Nick. We’ve both spent time in senior positions at marketing agencies, but currently hold senior director positions at public companies with 2,000+ employees (HubSpot and LendingTree).
Alongside this, as a trio we could really ramp up the quality and quantity of content within the community, spread out the administrative workload and just generally have more resources to throw at getting this thing off the ground. Admittedly, Nick was much more optimistic about the potential of Traffic Think Tank – something I’m very thankful for now – whereas Ian and I were in the camp of “you’re out of your mind if you think hundreds of people are going to pay us to be a part of a Slack channel”. To validate the idea at scale, we decided that we’d get an initial MVP of the community up and running with a goal of reaching 100 paying customers in the first six months. If we achieved that, we’d validated that it was a viable business and we would continue to pursue it. If not, we’d kill it. We spent the next month building out the initial tech stack that enabled us to accept payments, do basic user management to the Slack channel, and get a one-page website up and running with information on what Traffic Think Tank was all about.  After this was ready, we doubled down on getting some initial content created for members – I mean, we couldn’t have people just land in an empty Slack channel, could we? We created around ten initial videos, 20 or so articles and then some long threads full of useful information within the Slack channel so that members would have some content to pour into right from the beginning.  Then, it was time to go live. The first 100 customers Fortunately, both Nick and I had built a somewhat substantial following in the SEO space over the previous 5-10 years, so we at least had a large email list to tap into (a total of around 40,000 people). We queued up some launch emails, set an initial price of $99 per month and pressed send. 
[\[LINK\] The launch email I sent to my subscribers announcing Traffic Think Tank](https://mailchi.mp/matthewbarby/future-of-marketing-1128181) What we didn’t expect was to sell all of the initial 100 membership spots in the first 72 hours. “Shit. What do we do now? Are we ready for this many people? Are we providing them with enough value? What if something breaks in our tech stack? What if they don’t like the content? What if everyone hates Slack?” All of these were thoughts running through my head. This brings me to the first great decision we made: we closed down new membership intake for 3 months so that we could focus completely on adding value to the first cohort of users. The right thing at the right time SEO is somewhat of a dark art to many people that are trying to learn about it for the first time. There’s hundreds of thousands (possibly millions) of articles and videos online that talk about how to do SEO.  Some of it’s good advice; a lot of it is very bad advice.  Add to this that the barrier to entry of claiming to be an “expert” in SEO is practically non-existent and you have a recipe for disaster. This is why, for a long time, individuals involved in SEO have flocked in their masses to online communities for information and to bounce ideas off of others in the space. Forums like SEObook, Black Hat World, WickedFire, Inbound.org, /r/BigSEO, and many more have, at one time, been called home by many SEOs.  In recent times, these communities have either been closed down or just simply haven’t adapted to the changing needs of the community – one of those needs being real-time feedback on real-world problems.  The other big need that we all spotted and personally had was the ability to openly share the things that are working – and the things that aren’t – in SEO within a private forum. Not everyone wanted to share their secret sauce with the world. 
One of the main reasons we chose Slack as the platform to run our community on was the fact that it solved these two core needs. It gave the ability to communicate in real-time across multiple devices, and all of the information shared within it was outside of the public domain. The other problem that plagued a lot of these early communities was spam. Most of them were web-based forums that were free to access. That meant they became a breeding ground for people trying to either sell their services or promote their own content – neither of which is conducive to building a thriving community. This was our main motivation for charging a monthly fee to access Traffic Think Tank. We spent a lot of time thinking through pricing. It needed to be enough money that people would be motivated to really make use of their membership and act in a way that’s beneficial to the community, but not too much money that it became cost prohibitive to the people that would benefit from it the most. Considering that most of our members would typically spend between $200-800 per month on SEO software, $99 initially felt like the perfect balance. Growing pains The first three months of running the community went by without any major hiccups. Members were incredibly patient with us, gave us great feedback and were incredibly helpful and accommodating to other members. Messages were being posted every day, with Nick, Ian and myself seeding most of the engagement at this stage.  With everything going smoothly, we decided that it was time to open the doors to another intake of new members. At this point we’d accumulated a backlog of people on our waiting list, so we knew that simply opening our doors would result in another large intake. Adding more members to a community has a direct impact on the value that each member receives. 
For Traffic Think Tank in particular, the value for members comes from three areas: The ability to have your questions answered by me, Nick and Ian, as well as other members of the community. The access to a large library of exclusive content. The ability to build connections with the wider community. In the early stages of membership growth, there was a big emphasis on the first of those three points. We didn’t have an enormous content library, nor did we have a particularly large community of members, so a lot of the value came from getting a lot of one-to-one time with the community founders. [\[IMAGE\] Screenshot of engagement within the Traffic Think Tank Slack community](https://cdn.shortpixel.ai/client/qglossy,retimg,w_1322/https://www.matthewbarby.com/wp-content/uploads/2019/08/Community-Engagement-in-Traffic-Think-Tank.png) The good thing about having 100 members was that it was just about feasible to give each and every member some one-to-one time within the month, which really helped us to deliver those moments of delight that the community needed early on. Two-and-a-half months after we launched Traffic Think Tank, we opened the doors to another 250 people, taking our total number of members to 350. This is where we experienced our first growing pains.  Our original members had become used to being able to drop us direct messages and expect an almost instant response, but this wasn’t feasible anymore. There were too many people, and we needed to create a shift in behavior. We needed more value to come from the community engaging with one another or we’d never be able to scale beyond this level. We started to really pay attention to engagement metrics; how many people were logging in every day, and of those, how many were actually posting messages within public channels.  We asked members that were logging in a lot but weren’t posting (the “lurkers”) why that was the case. 
We also asked the members that engaged in the community the most what motivated them to post regularly. We learned a lot from doing this. We found that the large majority of highly-engaged members had much more experience in SEO, whereas most of the “lurkers” were beginners. This meant that most of the information being shared in the community was very advanced, with a lot of feedback from the beginners in the group being that they “didn’t want to ask a stupid question”.  As managers of the community, we needed to facilitate conversations that catered to all of our members, not just those at a certain level of skill. To tackle this problem, we created a number of new channels that had a much deeper focus on beginner topics so novice members had a safe place to ask questions without judgment.  We also started running live video Q&As each month where we’d answer questions submitted by the community. This gave our members one-on-one time with me, Nick and Ian, but spread the value of these conversations across the whole community rather than them being hidden within private messages. As a result of these changes, we found that the more experienced members in the community were really enjoying sharing their knowledge with those with less experience. The number of replies within each question thread was really starting to increase, and the community started to shift away from just being a bunch of threads created by me, Nick and Ian to a thriving forum of diverse topics compiled by a diverse set of individuals. This is what we’d always wanted. A true community. It was starting to happen. 
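As a rough illustration of the engagement metrics described above (who logs in often, and of those, who actually posts in public channels), here is a minimal Python sketch. The record fields, the 10-day "frequent visitor" threshold, and the sample data are all assumptions made up for this example; this is not Traffic Think Tank's actual tooling, and it does not use Slack's API.

```python
from dataclasses import dataclass


@dataclass
class MemberActivity:
    member: str            # hypothetical member handle
    days_logged_in: int    # days with at least one login in the period
    public_messages: int   # messages posted in public channels


def engagement_summary(activity, min_days=10):
    """Split frequent visitors into posters and lurkers.

    min_days is an assumed cutoff for "logging in a lot".
    """
    frequent = [m for m in activity if m.days_logged_in >= min_days]
    lurkers = [m.member for m in frequent if m.public_messages == 0]
    posters = [m.member for m in frequent if m.public_messages > 0]
    poster_share = len(posters) / len(frequent) if frequent else 0.0
    return {"lurkers": lurkers, "posters": posters, "poster_share": poster_share}


# Made-up sample data for one 30-day period.
sample = [
    MemberActivity("ana", 25, 40),
    MemberActivity("ben", 22, 0),
    MemberActivity("cho", 3, 1),
    MemberActivity("dev", 18, 0),
]

summary = engagement_summary(sample)
print(summary)
```

A list like `summary["lurkers"]` is the kind of output that would tell you whom to ask why they log in but don't post, which is the follow-up step described above.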
[\[IMAGE\] Chart showing community engagement vs individual member value](https://cdn.shortpixel.ai/client/qglossy,retimg,w_1602/https://www.matthewbarby.com/wp-content/uploads/2019/08/Community-Engagement-Balance-Graph.jpg) At the same time, we started to realize that we’ll eventually reach a tipping point where there’ll be too much content for us to manage and our members to engage with. When we reach this point, the community will be tough to follow and the quality of any given post will go down. Not only that, but the community will become increasingly difficult to moderate. We’re not there yet, but we recognize that this will come, and we’ll have to adjust our model again. Advocating advocacy As we started to feel more comfortable about the value that members were receiving, we made the decision to indefinitely open for new members. At the same time, we increased the price of membership (from $99 a month to $119) in a bid to strike the right balance between profitability as a business and to slow down the rate at which we were reaching the tipping point of community size. We also made the decision to repay all of our early adopters by grandfathering them in to the original pricing – and committing to always do this in the future. Despite the price increase, we saw a continued flow of new members come into the community. The craziest part about this was that we were doing practically no marketing activities to encourage new members– this was all coming from word of mouth. Our members were getting enough value from the community that they were recommending it to their friends, colleagues and business partners.  The scale at which this was happening really took us by surprise and it told us one thing very clearly: delivering more value to members resulted in more value being delivered to the business. This is a wonderful dynamic to have because it perfectly aligns the incentives on both sides. 
We’d said from the start that we wouldn’t sacrifice value to members for more revenue – this is something that all three of us felt very strongly about. First and foremost, we wanted to create a community that delivered value to its members and was run in a way that aligned with our values as people. If we could find a way to stimulate brand advocacy, while also tightening the bonds between all of our individual community members, we’d be boosting both customer retention and customer acquisition in the same motion. This became our next big focus. [\[TWEET\] Adam, one of our members wore his Traffic Think Tank t-shirt in the Sahara desert](https://twitter.com/AdamGSteele/status/1130892481099382784) We started with some simple things: We shipped out Traffic Think Tank branded T-shirts to all new members. We’d call out each of the individuals that would submit questions to our live Q&A sessions and thank them live on air. We set up a new channel that was dedicated to sharing a quick introduction to who you are, what you do and where you’re based for all new members. We’d created a jobs channel and a marketplace for selling, buying and trading services with other members. Our monthly “blind dates” calls were started where you’d be randomly grouped with 3-4 other community members so that you could hop on a call to get to know each other better. The Traffic Think Tank In Real Life (IRL)* channel was born, which enabled members to facilitate in-person meetups with each other. In particular, we saw that as members started to meet in person or via calls the community itself was feeling more and more like a family. It became much closer knit and some members started to build up a really positive reputation for being particularly helpful to other members, or for having really strong knowledge in a specific area. 
[\[TWEET\] Dinner with some of the Traffic Think Tank members in Brighton, UK](https://twitter.com/matthewbarby/status/1117175584080134149) Nick, Ian and I would go out of our way to try and meet with members in real life wherever we could. I was taken aback by how appreciative people were for us doing this, and it also served as an invaluable way to gain honest feedback from members. There was another trend that we’d observed that we didn’t really expect to happen. More and more members were doing business with one another. We’ve had people find new jobs through the community, sell businesses to other members, launch joint ventures together and bring members in as consultants to their business. This has probably been the most rewarding thing to watch, and it was clear that the deeper relationships that our members were forming were resulting in an increased level of trust to work with each other. We wanted to harness this and take it to a new level. This brought us to arguably the best decision we’ve made so far running Traffic Think Tank… we were going to run a big live event for our members. I have no idea what I’m doing It’s the first week of January 2019 and we’re less than three weeks away from Traffic Think Tank LIVE, our first ever in-person event hosting 150 people, most of whom are Traffic Think Tank members. It's like an ongoing nightmare I can’t wake up from. That was Nick’s response in our private admin channel to Ian and me when I asked if they were finding the run-up to the event as stressful as I was. I think that all three of us were riding on such a high from how the community was growing that we felt like we could do anything. Running an event? How hard can it be? Well, turns out it’s really hard.
We had seven different speakers flying over from around the world to speak at the event, there was a pre- and post-event party, and we’d planned a charity dinner where we would take ten attendees (picked at random via a raffle) out for a fancy meal. Oh, and Nick, Ian and I were hosting a live Q&A session on stage. It wasn’t until precisely 48 hours before the event that we realized we didn’t have any microphones, nor had a large amount of the swag we’d ordered arrived. Plus, a giant storm had hit Philly causing a TON of flight cancellations. Perfect. Just perfect. This was honestly the tip of the iceberg. We hadn’t thought about who was going to run the registration desk, who would be taking photos during the event and who would actually field questions from the audience while all three of us sat on stage for our live Q&A panel. Turns out that the answer to all of those questions was my wife, Laura, and Nick’s wife, Kelley. Thankfully, they were on hand to save our asses. The weeks running up to the event were honestly some of the most stressful of my life. We sold around 50% of our ticket allocation within the final two weeks before the event. All of the event organizers told us this would happen, but did we believe them? Hell no! Imagine having two weeks until the big day and as it stood half of the room would be completely empty. I was ready to fly most of my extended family over just to make it look remotely busy. [\[IMAGE\] One of our speakers, Ryan Stewart, presenting at Traffic Think Tank LIVE](https://cdn.shortpixel.ai/client/qglossy,retimg,w_1920/https://www.matthewbarby.com/wp-content/uploads/2019/08/Traffic-Think-Tank-LIVE-Ryan-Presenting.jpg) Thankfully, it all came together. We managed to acquire some microphones, the swag arrived on the morning of the event, all of our speakers were able to make it on time and the weather just about held up so that our entire allocation of ticket holders was able to make it to the event.
We pooled together and I’m proud to say that the event was a huge success. While we made a substantial financial loss on the event itself, January saw a huge spike in new members, which more than recouped our losses. Not only that, but we got to hang out with a load of our members all day while they said really nice things about the thing we’d built. It was both exhausting and incredibly rewarding. Bring on Traffic Think Tank LIVE 2020! (This time we’re hiring an event manager...)   The road ahead Fast forward to today (August 2019) and Traffic Think Tank has over 650 members. The biggest challenges that we’re tackling right now include making sure the most interesting conversations and best content surfaces to the top of the community, making Slack more searchable (this is ultimately one of its flaws as a platform) and giving members a quicker way to find the exclusive content that we create. You’ll notice there’s a pretty clear theme here. In the past 30 days, 4,566 messages were posted in public channels inside Traffic Think Tank. If you add on any messages posted inside private direct messages, this number rises to 21,612. That’s a lot of messages. To solve these challenges and enable further scale in the future, we’ve invested a bunch of cash and our time into building out a full learning management system (LMS) that all members will get access to alongside the Slack community. The LMS will be a web-based portal that houses all of the video content we produce. It will also  provide an account admin section where users can update or change their billing information (they have to email us to do this right now, which isn’t ideal), a list of membership perks and discounts with our partners, and a list of links to some of the best threads within Slack – when clicked, these will drop you directly into Slack. 
[\[IMAGE\] Designs for the new learning management system (LMS)](https://cdn.shortpixel.ai/client/qglossy,retimg,w_2378/https://www.matthewbarby.com/wp-content/uploads/2019/08/Traffic-Think-Tank-LMS.png) It’s not been easy, but we’re 95% of the way through this and I’m certain that it will have a hugely positive impact on the experience for our members. Alongside this we hired a community manager, Liz, who supports with any questions that our members have, coordinates with external experts to arrange webinars for the community, helps with new member onboarding, and has tightened up some of our processes around billing and general accounts admin. This was a great decision. Finally, we’ve started planning next year’s live event, which we plan to more than double in size to 350 attendees, and we decided to pick a slightly warmer location in Miami this time out. Stay tuned for me to have a complete meltdown 3 weeks from the event. Final thoughts When I look back on the journey we’ve had so far building Traffic Think Tank, there’s one very important piece to this puzzle that’s made all of this work that I’ve failed to mention so far: co-founder alignment. Building a community is a balancing act that relies heavily on those in charge being completely aligned. Nick, Ian and I completely trust each other and more importantly, are philosophically aligned on how we want to run and grow the community. If we didn’t have this, the friction between us could tear apart the entire community. Picking the right people to work with is important in any company, but when your business is literally about bringing people together, there’s no margin for error here.  While I’m sure there will be many more challenges ahead, knowing that we all trust each other to make decisions that fall in line with each of our core values makes these challenges dramatically easier to overcome. 
Finally, I’d like to thank all of our members for making the community what it is today – it’d be nothing without you and I promise that we’ll never take that for granted. I originally posted this on my blog here. I welcome all of your thoughts, comments and questions, and I'll do my best to answer them :)

10 Side Projects in 10 Years: Lessons from Failures and a $700 Exit
reddit
LLM Vibe Score: 0
Human Vibe Score: 1
TheValueProvider · This week

Hey folks, I'm sharing my journey so far in case it can help others. Entrepreneurship can sometimes be demotivating. In my case, I've always been involved in side projects, and what I've realized is that every time you crash a project, the next one makes it a bit further. So this is a long-term game, and consistency ends up paying off.

The $1 Android Game (2015, age 18)
What Happened:
- 500 downloads, 1€ in ad revenue
- Ugly UI, performance issues
Key Lessons:
- Don’t be afraid of launching. Delaying for “perfection” is often a sign that you fear being ignored. I was trying to perfect every aspect of the game. In reality, I was delaying the launch because I feared no one would download the app.
- Commit to the project or kill it. At some point, this project was no longer fun (it was just about fixing device responsiveness). Most importantly, I wasn't learning anything new, so I moved on to something else.

The Forex Bot Regret (2016, age 19)
What Happened:
- Lost months identifying nonexistent chart patterns
- Created a trading bot that was never profitable
Key Lessons:
- Day trading’s real winners are usually brokers. Plenty of people selling bots or systems are not making money from trading; why else would they sell a “money-printing machine”?
- Develop an unfair advantage. With these projects, I developed a strong coding foundation that gave me an edge when dealing with non-technical business people. Invest countless hours to create a skills gap between you and others, one that becomes increasingly difficult for them to close (coding, public speaking, networking, etc.).

The $700 Instagram Exit (2018, age 21)
What Happened:
- Grew a motivational account to 60k followers
- Sold it for $700
- 90% of followers were in low-income countries (hard to monetize)
Key Lessons:
- Follower quality > quantity. I focused on growth and ended up with an audience I couldn’t truly define. If brands don’t see value, you won’t generate revenue. Also, if you do not know who you are creating content for, you'll end up demotivated and stop posting.
- Great 3rd party product + domain authority = affiliate marketing works. In this case, I could easily promote an IG growth service because my 50k+ followers conveyed trust. Most importantly, the service I was promoting worked amazingly.

The Illegal Amazon Review Marketplace (2020, age 23)
What Happened:
- Sellers were reimbursing buyers for positive reviews
- Built a WordPress marketplace to facilitate “free products for reviews”
- Realized it violated Amazon’s terms
Key Lessons:
- Check for “red flags” when doing idea assessment. There will always be red and orange flags. It’s about learning to differentiate between them (e.g. illegality, 100% dependence on a platform, etc.).
- If there’s competition, it’s good; if they are making money, it’s even better. I was thrilled when I saw no competition for my “unique idea”. Later, I discovered the obvious reason.

Copying a “Proven” Business Model (2020, age 23)
What Happened:
- Tried recreating an Instagram “comment for comment” growth tool
- Instagram changed the algorithm and killed the growth strategy that the product used
Key Lessons:
- Do not build a business that depends 100% on another business; it is too risky. Mr. Musk can increase Twitter API pricing to $42,000 monthly without notice, and TikTok can be banned in the US. Due to the IG algorithm change, we had built a product that was not useful, and worse, now we had no idea how to grow an IG account.
- Consider future project synergies before selling. I regret having sold the 60k follower IG account, since it could have saved me a lot of time when convincing users to try the service.

NFT Marathon Medals (2021, age 24)
What Happened:
- Created NFT race medals
- Sold 20 for 5€ each, but spent 95% of meetings explaining “what is an NFT?”
Key Lessons:
- Market timing is crucial. As with every new technology, it is only useful as long as society is ready to adopt it. No matter how promising the tech is in the eyes of SV, society will end up dictating its success (blockchain, AI, etc). In this case, the runner community was not ready to adopt blockchain (it is not even prepared today). Race organizers did not know what they were selling, and runners did not know what they were buying.
- The 30-day rule in Fanatical Prospecting. Do not stop prospecting. I did prospecting and closed deals 3 months after the outbound efforts. Then I was busy executing the projects and had no clients once the projects were finished.

AI Portal & Co-Founder Misalignment (2023, age 26)
What Happened:
- Built a portal for SMEs to find AI use cases
- Co-founders disagreed on vision and execution
- Platform still gets ~1 new user/day
Key Lessons:
- Define roles and equity clearly. Our biggest strength ended up killing us. Both founders had strong strategic skills and we were constantly arguing about decisions.
- NextJS + Vercel + Supabase: great stack to create a SaaS MVP (but do not use AI with frameworks unless you know how they work conceptually).
- SEO is king. One of our users created a use case on “Changing Song Lyrics with AI.” Although it is not our target use case, it brings 90% of our traffic.

Building an AI Tool & Getting Ghosted (2024, age 27)
What Happened:
- SEO agency wanted to automate rewriting product descriptions
- Built it in 3 weeks, but the client vanished
Key Lessons:
- Validate manually first. Don’t code a full-blown solution for a problem you haven’t tested in real-world workflows. I kept rewriting code only to throw it away. Jumping straight into building a solution ended up costing more time than it saved.
- Use templates, no-code, and open-source for prototyping. In my case, using a Next.js template saved me about four weeks of development, only to hit the same dead end, but much faster.
- Fall in love with your ICP or walk away. I realized I didn’t enjoy working with SEO agencies. Looking back, I should have been honest with myself and admitted that I wasn’t motivated enough by this type of customer.

Ignoring Code Perfection Doubled Traffic (2025, age 28)
What Happened:
- Partnered with an ex-colleague to build an AI agents directory
- Focused on content & marketing, not endless bug fixes
- Traffic soared organically
Key Lessons:
- Measure the impact of your actions and double down on what works. We set up an analytics system with PostHog and found wild imbalances (e.g. 1 post about frameworks outperformed 20 promotional posts).
- You have to start somewhere. For us, the AI agents directory is much more than just a standalone site; it's a strategic project that will allow us to discover new products, gain domain authority, and boost other projects. It builds the path for bigger opportunities.
- Less coding, more traction. Every day I have to fight against myself not to code “indispensable features”. Surprisingly, the directory keeps gaining consistent traffic despite being far from perfect.

Quitting My Job & Looking Ahead (2025, age 28)
What Happened:
- Left full-time work to go all-in
- Plan to build vertical AI agents that handle entire business workflows (support, marketing, sales)
Key Lessons:
- Bet on yourself. The opportunity cost of staying in my full-time job outweighed the benefits. It might be the same for you.

I hope this post helps anyone struggling with their project and inspires those considering quitting their full-time job to take the leap with confidence.

I Watched My Startup Slowly Dying Over Two Years: Mistakes and Lessons Learned
reddit
LLM Vibe Score: 0
Human Vibe Score: 0.429
Personal-Expression3 · This week

If you are tired of reading success stories, you may want to hear my near-failure story. Last year in April, I went full-time on my startup. Nearly two years later, I’ve watched my product gradually die. I want to share some of the key mistakes I made and the lessons I’ve taken from them so you don't have to go through them. Some mistakes were very obvious in hindsight; others, I’m still not sure if they were mistakes or just bad luck. I’d love to hear your thoughts and advice as well. Background I built an English-learning app, with both web and mobile versions. The idea came from recognizing how expensive it is to hire an English tutor in most countries, especially for practicing speaking skills. With the rise of AI, I saw an opportunity in the education space. My target market was Japan, though I later added support for multiple languages and picked up some users from Indonesia and some Latin American countries too. Most of my users came from influencer marketing on Twitter. The MVP for the web version launched in Japan and got great feedback. People were reposting it on Twitter, and growth was at its peak in the first few weeks. After validating demand with the MVP, I decided to focus on the mobile app to boost user retention, but for various reasons, the mobile version didn’t launch until December 2023, 8 months after the web version. Most of this year has been spent iterating on the mobile app, but it didn’t make much of an impact in the end. Key Events and Lessons Learned Here are some takeaways: Find co-founders as committed as you are I started with two co-founders; both were tech people working part-time. After the web version launched, one dropped out due to family issues. Unfortunately, we didn’t set clear rules for equity allocation, so even after leaving, they still retained part of the equity. The other co-founder also effectively dropped out this year, contributing only minor fixes here and there.
So if you're starting a company with co-founders, make sure they're as committed as you are. Otherwise, you might be better off going solo. I ended up teaching myself programming with AI tools, starting with Flutter and eventually handling both front-end and back-end work using Windsurf. With dev tools getting more advanced, being a solo developer is becoming a more viable option. Also, have crystal-clear rules for equity, especially around what happens if someone leaves.

Outsourcing Pitfalls

Outsourcing development was one of my biggest mistakes. I initially hired a former colleague from India to build the app. He dragged the project on for two months with endless excuses, and the final output was unusable. Then I hired a company, but they didn't have enough skilled Flutter developers. The company's owner scrambled to find people, which led to rushed work and poor-quality code that I spent a lot of time revising myself. Outsourcing is a minefield. If you must do it, break the project into small tasks, set clear milestones, and review progress frequently. Catching issues early can save you time and money. Otherwise, you're often better off learning the tools yourself; modern dev tools are surprisingly beginner-friendly.

Trust, but Verify

I have a bad habit of trusting people too easily. I don't like spending time double-checking things, so I tend to assume people will do what they say they'll do. This mindset is dangerous in a startup. For example, if I had set up milestones and regularly verified the progress of my first outsourced project, I would've realized something was wrong within two weeks instead of two months. That would've saved me a lot of time and frustration. As I mentioned above, set up systems to verify people's work (milestones, deliverables, etc.) to minimize risk.

Avoid red ocean if you are small

My team was tiny (or non-existent, depending on how you see it), with no technical edge.
Yet, I chose to enter Japan's English-learning market, which is incredibly competitive. It's a red ocean, dominated by big players who've been in the game for years. Initially, my product's AI-powered speaking practice and automatic grammar correction stood out, but within months, competitors rolled out similar features. Looking back, I should've gone all-in on marketing during the initial hype and focused on rapidly launching the mobile app. But hindsight is 20/20.

'Understanding your user' helps but what if it's not what you want?

I thought I was pretty good at collecting user feedback. I added feedback buttons everywhere in the app and made changes based on what users said. But most of these changes were incremental improvements, not the kind of big updates that spark excitement. Also, my primary users were from Japan and Indonesia, but I'm neither Japanese nor Indonesian. That made it hard to connect with users on social media in an authentic way. And in my opinion, AI translations can only go so far: they lack the human touch and cultural nuance that builds trust. But honestly, I'm not sure it's correct to assume that users won't be moved once they recognize you are a foreigner. Many of my Japanese users were working professionals preparing for the TOEIC exam. I didn't design any features specifically for that; instead, I aimed to build a general-purpose English-learning tool, since I dream of expanding it to other markets someday. While there's nothing wrong with this idealistic approach, it didn't give users enough reasons to pay for the app.

Should You Go Full-Time?

From what I read, a lot of successful indie developers started part-time, building traction before quitting their jobs. But for me, I jumped straight into full-time mode, which worked for my lifestyle but might've hurt my productivity. I value work-life balance and refused to sacrifice everything for the startup.
The reason I chose to leave my corporate job is that I wanted to escape the toxic 996 working environment at China's internet companies. So even during my most stressful periods, I made time to watch TV with my partner and take weekends off. Anyway, if you're also building something or thinking about starting a business, I hope my story helps. If I have other thoughts later, I will add them too. I'd appreciate any advice.

I tested hundreds of marketing tools in the last three years and these 50 made it to the list. I'll sum up my top 50 marketing tools with one or two sentences + give you pricings.
reddit
LLM Vibe Score0
Human Vibe Score1
SpicyCopyThis week

I tested hundreds of marketing tools in the last three years and these 50 made it to the list. I'll sum up my top 50 marketing tools with one or two sentences + give you pricings.

Hey guys, I'm working in a growth marketing agency. Marketing tools are 30% of what we do, so we use them a lot and experiment with new ones as much as possible. There are thousands of tools and it's easy to get lost, so I wanted to share the tools we use most on a daily basis, divided into 14 categories. I thought this could be handy for the Entrepreneurs subreddit.

Why adopt tools?

I see marketing tools as tireless colleagues. If you can't hire an employee, choosing the right tool can solve your problems, because they:
Are super cheap.
Work 24/7 for you.
Don't make mistakes.
Don't need management (or need less management).
Help you to automate the majority of your lead gen process.

Onwards to the list. (With pricing included, the post ended up quite long, so you can find a link at the end if you want to check the prices.)

Email marketing tools
#1 ActiveCampaign is armed with the most complicated email automation features and has the most intuitive user experience. It feels like you already know how to use it.
#2 Autopilot is a visual marketing automation and customer journey tool that helps you acquire and nurture leads based on behaviors, interests, etc.
#3 Mailjet: This is the tool we use to send out bulk email campaigns such as newsletters. It doesn't have sexy features like others but does its job for a cheap price.

Email address finders
#4 Skrapp finds the emails of your contacts by name and company. It also works with LinkedIn Sales Navigator, can extract thousands of emails in bulk, and has a browser add-on.
#5 Hunter: Similar to Skrapp but doesn't work with LinkedIn Sales Navigator directly. In addition, there are email templates and you can set up email campaigns.

Prospecting and outreach tools
#6 Prospect combines personal emails, follow-up calls, and other social touches, and helps you create multichannel campaigns.
#7 Reply is a more intuitive version of Prospect. It is easy to learn and use; their UX makes you feel good and competent.
CRM tools
#8 Salesflare helps you to stop managing your data and start managing your customers. Not yet as popular as Hubspot and the like, but the best solution for smaller B2B businesses. (We're fans.)
#9 Hubspot: The most popular CRM for good reason, with a broader product range you can adopt in your next steps. Try this if you have a large list of customers, because it is free.
#10 Pardot: Pardot is by Salesforce. It's armed with features that can close the gap between marketing and sales.

Sales Tools
#11 Salesforce is the best sales automation and lead management software. It helps you to create complicated segmentations and run, track, and analyze campaigns from the same dashboard.
#12 LinkedIn Sales Navigator gives you full access to LinkedIn's user database. You can even find a kidnapped CEO if you know how to use it with other marketing automation tools like Skrapp.
#13 Pipedrive is a simple tool and excels in one thing. It tracks your leads and tells you when to take the next action. It makes sales easier.
#14 Qwilr creates great-looking docs, at speed. You can design perfect proposals, quotes, client updates, and more in a flash. We use it a lot to close deals; it's effective.
#15 Crystalknows is an add-on that tells you anyone's personality on LinkedIn and gives you a detailed approach specific to that person. It's eerily accurate.
#16 Leadfeeder shows you the companies that visited your website. It tells you how they found you and what they're interested in. It has a free version.

Communication Tools
#17 Intercom is a sweet and smart host that welcomes your visitors when you're not home. It's one of the best chatbot tools on the market.
#18 Drift is famous for its conversational marketing features and is more sales-focused than Intercom.
#19 Manychat is a chatbot that helps you create high-converting Facebook campaigns.
#20 Plann3r helps you create your personalized meeting page. You can schedule meetings with clients, candidates, and prospects.
#21 Loom is a video messaging tool. It helps you to be more expressive and create closer relationships.
#22 Callpage collects your visitors' phone numbers and connects you with them in seconds, no matter where you are.

Landing page tools
#23 Instapage is the best overall landing page builder. It has a broad range of features, and even a squirrel can build a compelling landing page with its templates. No coding needed.
#24 Unbounce can do everything that Instapage does and lets you build a great landing page without a developer. But it's less intuitive.

Lead generation / marketing automation tools
#25 Phantombuster is by far the most used lead generation software in our tool kit. It extracts data and emails, sends requests and customized messages, and does many things on autopilot on any platform. You can check this, this and this if you want to see it in action.
#26 Duxsoup is a Google Chrome add-on that can also automate some LinkedIn lead generation efforts, like Phantombuster. But it doesn't work in the cloud.
#27 Zapier is the glue that holds all the lead generation tools together. With Zapier, you can connect different marketing tools with no coding required.

Conversion rate optimization tools
#28 Hotjar tracks what people are doing on your website by recording sessions and capturing mouse movements. Then it gives you a heatmap.
#29 UsabilityHub shows your page to a digital crowd, measures first impressions, and helps you to validate your ideas.
#30 Optinmonster is a top-tier conversion optimization tool. It helps you to capture leads and increase conversion rates with many features.
#31 Notifia is one mega tool of widgets that arms your website with the wildest social proof and lead capturing tactics.
#32 Sumo is a much simpler version of Notifia. But Sumo has everything to help you capture leads and build your email lists.
Web scrapers
#33 Data Miner is a Google Chrome browser extension that helps you scrape data from web pages into a CSV file or Excel spreadsheet.
#34 Webscraper does the same thing as Data Miner; however, it is capable of handling more complex tasks.

SEO and Content
#35 Grammarly: English could be your first language and your grammar could be better than Shakespeare's. Grammarly can still make your writing better.
#36 Hemingwayapp is a copywriting optimization tool that gives you feedback about your copy, improves your readability score, and makes your writing bolder and punchier. Free.
#37 Ahrefs is an all-rounder search engine optimization tool that helps you with off-page, on-page, or technical SEO.
#38 SurferSEO makes things easier for your on-page SEO efforts. It's a tool that analyzes top Google results for specific keywords and gives you a content brief based on that data.

Video editing and design tools
#39 Canva is a graphic design platform that makes everything easy. It has thousands of templates for anything from Facebook ads and stylish presentations to business cards.
#40 Kapwing is our go-to platform for quick video edits. It works in the browser and can help you to create stylish videos, add subtitles, resize videos, create memes, or remove backgrounds.
#41 Animoto can turn your photos and video clips into beautiful video slideshows. It comes in handy when you want to create advertising material but don't have a budget.

Advertising tools
#42 AdEspresso lets you create and test multiple ads with a few clicks. You can optimize your FB, IG, and Google ads from this tool and measure your ads with in-depth analytics.
#43 AdRoll is an AI-driven platform that connects and coordinates marketing efforts across ads, email, and online stores.

Other tools
#44 Replug helps you to shorten, track, and optimize your links with call-to-actions, branded links, and retargeting pixels.
#45 Draw.io = Mindmaps, schemes, and charts. With Draw.io, you can put your brain on digital paper in an organized way.
#46 Built With is a tool that finds out what websites are built with, so you can see what tools they're using and so on.
#47 Typeform can turn data collection into an experience. This tool helps you to engage your audience with conversational forms or surveys and to collect more data.
#48 Livestorm helped us a lot, especially in COVID-19 times. It's webinar software that works in your browser, on mobile, and on desktop.
#49 Teachable: If you have an online course idea but are hesitating because of the production process, Teachable can help you. It's easy to configure and customizable for your needs.
#50 Viral Loops provides a revolutionary referral marketing solution for modern marketers. You can create and run referral campaigns in a few clicks with templates.

Remember, most of these tools have a free trial or free version. Going over them one by one can teach you a lot and help you grow your business with less manpower in the early stages of your business. I hope you enjoyed the read and can find some tools to make things easier! Let me know about your favorite tools in the comments, so I can try them out.

If you want to check the prices and see a broader explanation about the tools, you can go here.

How to increase the sales of my book
reddit
LLM Vibe Score0
Human Vibe Score1
danonino80This week

How to increase the sales of my book

In just 3 months, it generated over $100 in revenue. I wanted to share my journey for two reasons: to potentially assist others in self-publishing their own books and to receive feedback to enhance my marketing strategy. I imagine there are others facing similar challenges. Let's dive into the financials, time spent, key takeaways, and the challenges to address behind this product.

Finances

First, let's take a look at the financial overview.

💳 Expenses
🔹 E-book creation:
· Book cover: $0. I used Adobe Express with a 30-day free trial.
· ChatGPT: $20 a month. I leveraged AI to generate the chapters of the book, ensuring that no critical topics were overlooked during the content creation process, and to refine the English, as it's not my native language. I also used it to help me with the copywriting for the website. If anyone is interested, I can share my Python code for outlining the chapters by calling the API, but you can also ask ChatGPT directly.
· Kindle KDP (Kindle Direct Publishing): order of author copies: $10.
🔹 Web creation:
· Domain: I got a .com / .org / .net domain for just $1 the first year.
· Carrd.co subscription: $19 (1 year)
🔹 Marketing:
· Promoted post on Reddit: $30
· Paid ads with Google Ads: $30
💰 Revenue
🔸 Sales: $102
💸 Net Profit: ~ -$18

I initially thought the sales for this e-book would be quite modest, maybe only 3 or 4 books. However, the fact that I've sold more than that so far is a pleasant surprise. Even though the overall numbers may still be considered "peanuts" in the grand scheme of book sales, it suggests there could be more demand for content on digital asset custody than I had originally anticipated. This is a good learning experience, and I'll look to refine my marketing approach to see if I can reach a wider audience interested in this topic.

🔹 Time Spent

Next, let's review the time invested.
📖 Writing the e-book: 40 hours
🌍 Website + Stripe integration: 10 hours
📣 Creating promotional content: 10 hours
⏱️ Additional marketing efforts: 5 hours
Total time spent: 65 hours

As you can see, I dedicated more time to writing the e-book itself than to marketing and distribution. I still spent a meaningful amount of time on marketing because I thought that a successful product launch requires a robust marketing effort. Many e-book authors overlook this crucial aspect!

I utilized three sales channels:
· Amazon: I found that there were no books specifically about digital asset custody, resulting in strong positioning in Amazon searches. Additionally, my book immediately secured the top position in Google searches for "digital asset custody book." However, despite achieving 50% of sales in the UK, I have not received any reviews globally. Sales distribution for this channel: 20% physical book, 80% ebook.
· Twitter: Daniel_ZZ80. With only 46 followers, the performance on this platform has not been optimal. I am beginning to write posts related to digital assets to increase visibility.
· Gumroad: Lockeyyy.gumroad.com. I offered a discounted version of the ebook, but have not yet made any sales through this channel.

Key takeaways:
· The process of creating this e-book was extremely fulfilling, and while it has garnered overwhelmingly positive feedback from friends and colleagues (not counted as sales), it has yet to receive any Amazon reviews ☹.
· Kindle KDP proved to be ideal for a rapid go-to-market strategy.
· AI is an excellent tool for generating ideas and providing access to global audiences with perfect grammar. Otherwise, I would need to hire a translator, which can be very expensive.
· Despite my offering a full 30-day money-back guarantee, no one has asked for a refund, leading me to believe that the quality of the content is indeed good.
· I have gained valuable insights for future technical books.
· Although the current financial balance may be negative, I anticipate reaching the break-even point within one month, and this has now become a passive income stream. However, I recognize the need to regularly update the content due to the rapidly changing nature of this field.

Challenges to address:
· Is the timing for launching this book appropriate? In other words, is the world of digital asset custody a trendy and interesting topic for the audience?
· What is causing the lack of sales through Gumroad?
· Should I seek assistance as my marketing efforts have not yielded results?
· Why are there no reviews on Amazon?
· Why are sales primarily concentrated in the EU with only one sale in the US, which is my main target market?

Feedback is appreciated. If you're interested in learning more about my approach, feel free to send me a direct message.

A bit about my background: After dedicating my entire career to the banking industry, I explored various side projects. As an IT professional, I have now transitioned into the digital asset realm. After three years of intensive study, I recently published my first book on digital asset custody. I hope you found this post informative. Cheers!

P.S.: I'm currently in the process of launching two more books using this system. 😊
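For readers who want to check the arithmetic in the finances and time breakdown, here is a minimal Python sketch. The amounts are copied from the post; the dictionary keys and the `summarize` helper are illustrative names, and counting ChatGPT as a single $20 month is an assumption. Under that assumption the itemized costs come to $110, which gives a net of -$8 rather than the roughly -$18 quoted in the post, so the post presumably counts some cost differently (for example more than one month of a subscription).

```python
# Amounts are copied from the post; key names are illustrative only.
EXPENSES_USD = {
    "book_cover": 0,
    "chatgpt": 20,              # assumption: one month at $20
    "kdp_author_copies": 10,
    "domain_first_year": 1,
    "carrd_one_year": 19,
    "reddit_promoted_post": 30,
    "google_ads": 30,
}
REVENUE_USD = 102
HOURS = {
    "writing": 40,
    "website_and_stripe": 10,
    "promotional_content": 10,
    "additional_marketing": 5,
}


def summarize(expenses, revenue, hours):
    """Return totals, net result, and a rough revenue-per-hour figure."""
    total_cost = sum(expenses.values())
    total_hours = sum(hours.values())
    net = revenue - total_cost
    return {
        "total_cost": total_cost,
        "net": net,
        "total_hours": total_hours,
        "revenue_per_hour": revenue / total_hours,
        # Extra sales revenue needed before the project stops losing money.
        "extra_revenue_to_break_even": max(0, -net),
    }


summary = summarize(EXPENSES_USD, REVENUE_USD, HOURS)
print(summary)
```

Changing the `chatgpt` entry to reflect the actual number of subscription months is the quickest way to see how sensitive the break-even point is to that one recurring cost.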

voicefilter
github
LLM Vibe Score0.496
Human Vibe Score0.029786815978503328
maum-aiMar 24, 2025

voicefilter

VoiceFilter

Note from Seung-won (2020.10.25)

Hi everyone! It's Seung-won from MINDs Lab, Inc. It's been a long time since I released this open-source project, and I didn't expect this repository to attract so much attention for so long. I would like to thank everyone for the attention, and also Mr. Quan Wang (the first author of the VoiceFilter paper) for referring to this project in his paper.

Actually, I did this project only 3 months after I started studying deep learning & speech separation, without a supervisor in the relevant field. Back then, I didn't know what power-law compression was, or the correct way to validate/test models. Now that I've spent more time on deep learning & speech since then (I also wrote a paper published at Interspeech 2020 😊), I can see some obvious mistakes that I made. Those issues were kindly raised by GitHub users; please refer to the Issues and Pull Requests for details. That being said, this repository can be quite unreliable, and I would like to remind everyone to use this code at their own risk (as specified in LICENSE).

Unfortunately, I can't afford extra time for revising this project or reviewing the Issues / Pull Requests. Instead, I would like to offer some pointers to newer, more reliable resources:

VoiceFilter-Lite: This is a newer version of VoiceFilter presented at Interspeech 2020, which is also written by Mr. Quan Wang (and his colleagues at Google). I highly recommend checking this paper, since it focuses on a more realistic situation where VoiceFilter is needed.

List of VoiceFilter implementations available on GitHub: In March 2019, this repository was the only available open-source implementation of VoiceFilter. However, much better implementations that deserve more attention have become available across GitHub. Please check them, and choose the one that meets your needs.
PyTorch Lightning: Back in 2019, I could not find a great deep-learning project template for myself, so my colleagues and I used this project as a template for other new projects. For people who are searching for such a project template, I would like to strongly recommend PyTorch Lightning. Even though I put a lot of effort into developing my own template during 2019 (VoiceFilter -> RandWireNN -> MelNet -> MelGAN), I found PyTorch Lightning much better than my own template.

Thanks for reading, and I wish everyone good health during the global pandemic situation.

Best regards, Seung-won Park

Unofficial PyTorch implementation of Google AI's VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking.

Result

Training took about 20 hours on AWS p3.2xlarge (NVIDIA V100).

Audio Sample

Listen to audio samples at the webpage: http://swpark.me/voicefilter/

Metric

| Median SDR | Paper | Ours |
| ---------------------- | ----- | ---- |
| before VoiceFilter | 2.5 | 1.9 |
| after VoiceFilter | 12.6 | 10.2 |

SDR converged at 10, which is slightly lower than the paper's.

Dependencies

Python and packages

This code was tested on Python 3.6 with PyTorch 1.0.1. Other packages can be installed by:

Miscellaneous

ffmpeg-normalize is used for resampling and normalizing wav files. See README.md of ffmpeg-normalize for installation.

Prepare Dataset

Download LibriSpeech dataset

To replicate the VoiceFilter paper, get the LibriSpeech dataset at http://www.openslr.org/12/. train-clean-100.tar.gz (6.3G) contains speech of 252 speakers, and train-clean-360.tar.gz (23G) contains 922 speakers. You may use either, but the more speakers you have in the dataset, the better VoiceFilter will be.

Resample & Normalize wav files

First, unzip tar.gz file to desired folder:

Next, copy utils/normalize-resample.sh to the root directory of the unzipped data folder.
Then:

Edit config.yaml

Preprocess wav files

In order to boost training speed, perform STFT for each file before training by:

This will create 100,000 (train) + 1000 (test) data. (About 160G)

Train VoiceFilter

Get pretrained model for speaker recognition system

VoiceFilter utilizes a speaker recognition system (d-vector embeddings). Here, we provide a pretrained model for obtaining d-vector embeddings. This model was trained with the VoxCeleb2 dataset, where utterances are randomly fit to a time length of [70, 90] frames. Tests are done with window 80 / hop 40 and have shown an equal error rate of about 1%. Data used for the test were selected from the first 8 speakers of the VoxCeleb1 test dataset, where 10 utterances per speaker are randomly selected. Update: Evaluation on VoxCeleb1 selected pair showed 7.4% EER. The model can be downloaded at this GDrive link.

Run

After specifying traindir, testdir at config.yaml, run:

This will create chkpt/name and logs/name at the base directory (-b option, . by default)

View tensorboardX

Resuming from checkpoint

Evaluate

Possible improvements

Try power-law compressed reconstruction error as loss function, instead of MSE. (See #14)

Author

Seungwon Park at MINDsLab (yyyyy@snu.ac.kr, swpark@mindslab.ai)

License

Apache License 2.0

This repository contains code adapted/copied from the following:
utils/adabound.py from https://github.com/Luolc/AdaBound (Apache License 2.0)
utils/audio.py from https://github.com/keithito/tacotron (MIT License)
utils/hparams.py from https://github.com/HarryVolek/PyTorchSpeakerVerification (No License specified)
utils/normalize-resample.sh from https://unix.stackexchange.com/a/216475
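The README's Metric section reports median SDR (signal-to-distortion ratio) before and after VoiceFilter. As a rough illustration of what such a number measures, here is a minimal Python sketch of a simple energy-ratio SDR, 10·log10(‖s‖² / ‖s − ŝ‖²), and its median over several utterances. This is an assumption-laden simplification: the function names `sdr_db` and `median_sdr_db` are hypothetical, and the repository's own evaluation may use a different SDR definition (for example a BSS-Eval style one), so values from this sketch are not directly comparable to the table.

```python
import math
from statistics import median


def sdr_db(reference, estimate):
    """Simple energy-ratio SDR in dB between a clean reference and an estimate.

    Both arguments are equal-length sequences of audio samples.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(s * s for s in reference)
    if signal_energy == 0:
        raise ValueError("reference is silent; SDR is undefined")
    distortion_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    if distortion_energy == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / distortion_energy)


def median_sdr_db(pairs):
    """Median SDR over an iterable of (reference, estimate) pairs."""
    return median(sdr_db(ref, est) for ref, est in pairs)


# Toy example with hand-made "signals" (not real audio).
clean = [2.0, 0.0, -2.0, 0.0]
noisy_mix = [1.0, 1.0, -1.0, 1.0]      # stands in for the input mixture
separated = [2.0, 0.5, -2.0, -0.5]     # stands in for a model output

print("before:", sdr_db(clean, noisy_mix))
print("after: ", sdr_db(clean, separated))
```

A higher value after separation than before indicates that the estimate is closer to the clean reference than the mixture was, which is the comparison the table's "before" and "after" rows make.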

Overmind
github
LLM Vibe Score0.469
Human Vibe Score0.20474237922306593
bencbartlettMar 23, 2025

Overmind

Current release: Overmind v0.5.2 - Evolution (https://github.com/bencbartlett/Overmind/releases)

See the changelog (https://github.com/bencbartlett/Overmind/blob/master/CHANGELOG.md) for patch notes
Documentation is available at the documentation site (https://bencbartlett.github.io/overmind-docs/) and the wiki (https://github.com/bencbartlett/Overmind/wiki)
Join the discussion in the #overmind Slack channel (https://screeps.slack.com/messages/overmind)!
Read blog posts about development
Submit an issue (https://github.com/bencbartlett/Overmind/issues/new) or request a feature (https://github.com/bencbartlett/Overmind/issues/new?template=feature_request.md)
Find me in game here

About Overmind

What is Screeps?

Screeps is an MMO strategy game for programmers. The core objective is to expand your colony, gathering resources and fighting other players along the way. To control your units, you code an AI in JavaScript; everything from moving, mining, building, fighting, and trading is entirely driven by your code. Because Screeps is an MMO, it takes place on a single server that runs 24/7, populated by every other player and their army of creeps. When you log off, your population continues buzzing away with whatever task you set them. Screeps pits your programming prowess head-to-head with other people to see who can think of the most efficient methods of completing tasks or imagine new ways to defeat enemies.

What is Overmind?

Overmind is my personal codebase that I run on the public server. The structure of the AI is themed loosely around the Zerg's swarm intelligence from Starcraft. Overlords orchestrate Creep actions within each Colony, and the colony Overseer places Directives to adapt to stimuli. Finally, the Assimilator allows all players running Overmind to act as a collective hivemind, sharing creeps and resources and responding jointly to a master ledger of all directives shared by all players.
The AI is entirely automated, although it can also run in manual or semiautomatic mode. The latest release should work right out of the box; however, if you find something broken, please submit an issue and I'll try to fix it.

Can I use Overmind as my bot?

If you're new to Screeps, I would definitely recommend writing your own AI: most of the fun of the game is programming your own bot and watching your little ant farm run! However, I've tried to make the codebase readable and well-documented, so feel free to fork the project or use it as inspiration when writing your AI. If you still want to use Overmind on the public server, that's okay too - there are a number of people already doing this. But please realize that using a mature AI like this gives you a huge advantage over other new players, so don't go out of your way to ruin someone else's fun. In the future, I will be implementing methods for novice players to opt out of excessive aggression by Overmind bots (as long as they don't start a conflict and stay out of its way).

Installation

Out of the box

If you just want to run Overmind without modification, you can copy the compiled main.js file attached to the latest release into your script. While Overmind is fully automated by default, it can be run with varying levels of autonomy; refer to the Overmind wiki for how to configure and operate the bot.

Compiling from source

To install the full codebase, download or clone the repository. (Please note that while the latest release of Overmind should always be stable, the latest commit may contain unstable features.) Navigate to the Overmind root directory and run .
To compile and deploy the codebase, create a screeps.json file from the example file, then do one of the following actions:

• Compile and deploy to public server: npm run push-main
• Compile and deploy to private server: npm run push-pserver
• Compile without deploying: npm run compile

Overmind uses rollup to bundle the compiled TypeScript into a single main.js file. The codebase includes functionality to compute checksums for internal validation - if you have a different version of rollup installed globally, different checksums may be computed and some functionality will be disabled. Please ensure the local installation of rollup found in node_modules is used.

Setting up the Grafana dashboard

Overmind includes a Grafana dashboard (shown below) which tracks detailed operating statistics. To set up the dashboard:

1. Register for the Grafana service at screepspl.us
2. Set up the ScreepsPlus hosted agent (simpler) or use the NodeJS agent on a free micro instance of Google Compute.
3. Import the dashboard from Overmind.json and change $User to your username.
4. Enjoy your pretty graphs!

Design overview

Check out the Overmind wiki for in-depth explanations of parts of the design of the AI. (Click the diagram below to see a higher-resolution version.)
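
As a rough illustration of the hierarchy described under "What is Overmind?" (an Overseer places Directives in response to stimuli, and Overlords direct Creeps within a Colony), here is a minimal standalone sketch. All type names, fields, task strings, and the one-creep quota are hypothetical and chosen only for illustration; this is not Overmind's actual code and it does not use the Screeps API.

```typescript
// Illustrative sketch only: every name below is hypothetical, not Overmind's real API.

type Stimulus = "hostiles" | "lowEnergy";

interface Creep {
  name: string;
  task: string | null; // null means the creep is idle
}

// A Directive records something the colony should respond to.
interface Directive {
  kind: Stimulus;
  room: string;
}

// An Overlord recruits idle creeps (up to a quota) and gives them a task.
class Overlord {
  private creeps: Creep[] = [];

  constructor(public readonly directive: Directive, private readonly quota: number) {}

  run(pool: Creep[]): void {
    const action = this.directive.kind === "hostiles" ? "defend" : "harvest";
    for (const creep of pool) {
      if (this.creeps.length >= this.quota) break;
      if (creep.task === null) {
        creep.task = `${action}:${this.directive.room}`;
        this.creeps.push(creep);
      }
    }
  }
}

class Colony {
  directives: Directive[] = [];
  overlords: Overlord[] = [];

  constructor(public readonly room: string, public readonly creeps: Creep[]) {}
}

// The Overseer turns observed stimuli into Directives, at most one per kind.
class Overseer {
  constructor(private readonly colony: Colony) {}

  placeDirectives(stimuli: Stimulus[]): void {
    for (const kind of stimuli) {
      if (this.colony.directives.some(d => d.kind === kind)) continue;
      const directive: Directive = { kind, room: this.colony.room };
      this.colony.directives.push(directive);
      this.colony.overlords.push(new Overlord(directive, 1)); // assumed quota of 1
    }
  }
}

// One game tick: place directives first, then let each overlord act.
function runTick(colony: Colony, stimuli: Stimulus[]): void {
  new Overseer(colony).placeDirectives(stimuli);
  for (const overlord of colony.overlords) {
    overlord.run(colony.creeps);
  }
}

// Usage: two idle creeps, two stimuli observed in the same tick.
const demoColony = new Colony("W1N1", [
  { name: "alpha", task: null },
  { name: "beta", task: null },
]);
runTick(demoColony, ["hostiles", "lowEnergy"]);
console.log(demoColony.creeps);
```

Keeping directive placement separate from creep control means a repeated stimulus on later ticks does not create a duplicate Directive or Overlord; the real codebase is far more elaborate, and the wiki remains the authoritative description of its design.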

YT_Emerging_Technologies_Introduction_to_AI
github
LLM Vibe Score0.461
Human Vibe Score0.039054583141409485
zusmaniJan 17, 2025

YT_Emerging_Technologies_Introduction_to_AI

YouTube Channel: Emerging Technologies
Playlist: Introduction to AI
Instructor: Zeeshan-ul-hassan Usmani

Dear Students,

I have uploaded all relevant material here for your quick access and learning. I hope you will find it beneficial.

Yours Truly,
Zeeshan

===========================================
Video title: Resources

Books to Order:
Artificial Intelligence by Zeeshan Usmani - https://gufhtugu.com/artificial-intelligence
Artificial Intelligence by Baqir Naqvi - https://gufhtugu.com/masnoi-zahanat/

Recommended Books
• Gödel, Escher, Bach: An Eternal Golden Braid by Douglas R. Hofstadter. A classic, poetic, philosophical defense of AI.
• Machines Who Think by Pamela McCorduck. A good review of early AI history.
• Robot: Mere Machine to Transcendent Mind by Hans P. Moravec. Somewhat hyped book by a CMU robotics researcher.
• Flesh and Machines: How Robots Will Change Us by Rodney Allen Brooks. Reasonably decent book by MIT's leading robotics researcher.
• Wired for War by Peter Warren Singer. Reviews the growing use of robots and unmanned vehicles in warfare.
• Behind Deep Blue: Building the Computer That Defeated the World Chess Champion by Feng-Hsiung Hsu. Autobiographical book on the development of a history-making game-playing system. Interesting personal story of the hard engineering work that went into the system, with a few interesting facts on the technical aspects.
• The Age of Spiritual Machines: When Computers Exceed Human Intelligence by Ray Kurzweil. A recent view by an AI entrepreneur that has content if you ignore all the hype and the overly optimistic trust that Moore's law will magically solve all of the major problems.
• Hal's Legacy: 2001's Computer As Dream and Reality. An interesting collection of edited articles written to celebrate the fictional birthday of a famous intelligent computer whose true birthday must unfortunately be delayed, pending AI's inevitable progress.
• The Sciences of the Artificial by Herbert Simon. AI as science by one of its founders.
• Models of My Life by Herbert Simon. An autobiography of one of AI's founders whose intellectual contributions also include fundamental contributions to economics (for which he won the Nobel prize), cognitive psychology, and computer science (such as co-inventing the linked list in the 1950s).
• Alan Turing: The Enigma by Andrew Hodges. A biography of one of the founders of CS and originator of the Turing test. Also a testimony to the tragic implications of homophobia.
• The Emperor's New Mind: Concerning Computers, Minds, and the Laws of Physics; Shadows of the Mind: A Search for the Missing Science of Consciousness; and The Large, the Small and the Human Mind by Roger Penrose. A completely bogus argument against AI by a hopelessly Platonic mathematician. The last book contains an appended article by Stephen Hawking (a colleague of Penrose's) who of course doesn't buy his bogus argument.
• The Mind's New Science: A History of the Cognitive Revolution by Howard Gardner. A nice history of the development of cognitive science.
• How the Mind Works, The Language Instinct, and Words and Rules: The Ingredients of Language by Steven Pinker. Fun reading on lots of interesting issues in modern Cognitive Science and Linguistics if you don't take his exaggerated beliefs in nativism and evolutionary psychology too seriously.
• Bots: The Origin of New Species by Andrew Leonard. A light, somewhat hyped book on Internet agents, chatterbots, etc. with a few funny stories.
• Mathematics: The Loss of Certainty by Morris Kline. A very nice book on the failed enterprise of using logic to build a firm foundation for infallible mathematics and the role of Gödel's Incompleteness Theorem in the philosophy of mathematics.
• Incompleteness: The Proof and Paradox of Kurt Gödel by Rebecca Goldstein. An interesting biography of Kurt Gödel. Too bad he was such a Platonist that, unlike Turing, he did not understand the true implications of his own theorems (interesting author connection: Goldstein is Pinker's wife).

Links:
• AAAI AI Topics, basic info on AI from the American Association for Artificial Intelligence: http://www.aaai.org/AITopics/html/welcome.html
• Loebner Prize for limited Turing test: http://www.loebner.net/Prizef/loebner-prize.html
• IBM's Deep Blue Page: http://www.research.ibm.com/deepblue/
• Robocup: Robotic Soccer Competition: http://www.robocup.org/
• NY Times article on Proof of the Robbins Theorem: http://www.nytimes.com/library/cyber/week/1210math.html
• NY Times article on Bayes Nets at Microsoft Research: http://www.nytimes.com/library/tech/00/07/biztech/articles/17lab.html

===========================================
Video title: Numbers Infinity

Video Link - https://www.youtube.com/watch?v=hlXHwMgS06c
https://www.cbs.com/shows/numb3rs/
http://numb3rs.wolfram.com/

===========================================
Video title: 20 Hours Rule and Assignment

Assignment - https://www.urdufake2020.cicling.org/

===========================================
Video title: Assignments – P1

Mostly Human - https://money.cnn.com/mostly-human

===========================================
Video title: Assignments – P2

Assignment – 2 - https://replika.ai/
Assignment – 3 – Teachable Machines https://teachablemachine.withgoogle.com/
Assignment – 4 – Tensor Flow Playground https://playground.tensorflow.org
Assignment – 5 – GPT-3 Paper (175B Parameters) https://debuild.co/
Assignment – 6 - Image GPT-3 https://openai.com/blog/image-gpt/

===========================================
Video title: Create your own Deep Fake

1. https://colab.research.google.com/drive/1mGg_fmvhTpvkPkclw2yKkhALVzmawfvT?usp=sharing
2. https://drive.google.com/drive/folders/1wW1bxRV2S7Ce8gc3VDTzMQABE3-WCc_Y?usp=sharing
• Go into your Google Drive, find the cloned folder, and make sure it contains the files vox-adv-cpk.pth.tar and vox-cpk.pth.tar
• Aliaksandr Siarohin: https://github.com/AliaksandrSiarohin/first-order-model