Top 100 · Reinforcement Learning
100 repositories sorted by reinforcement learning
| # | Repository | Stars | Forks | Language | Issues | Description | Last Commit |
|---|---|---|---|---|---|---|---|
| 1 | cs-video-courses Developer-Y | 81.0k | 11.2k | N/A | 0 | List of Computer Science courses with video lectures. | 2026-05-01 |
| 2 | annotated_deep_learning_paper_implementations labmlai | 66.5k | 6.7k | Python | 28 | 🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠 | 2026-01-22 |
| 3 | unsloth unslothai | 63.6k | 5.6k | Python | 1054 | Web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally. | 2026-05-06 |
| 4 | ray ray-project | 42.4k | 7.5k | Python | 2948 | Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. | 2026-05-06 |
| 5 | gym openai | 37.2k | 8.7k | Python | 113 | A toolkit for developing and comparing reinforcement learning algorithms. | 2026-03-26 |
| 6 | applied-ml eugeneyan | 28.8k | 3.8k | N/A | 3 | 📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production. | 2024-07-18 |
| 7 | d2l-en d2l-ai | 28.8k | 5.1k | Python | 121 | Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge. | 2024-08-18 |
| 8 | sglang sgl-project | 27.1k | 5.7k | Python | 637 | SGLang is a high-performance serving framework for large language models and multimodal models. | 2026-05-06 |
| 9 | examples pytorch | 23.9k | 9.8k | Python | 198 | A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc. | 2025-09-01 |
| 10 | reinforcement-learning dennybritz | 22.0k | 6.1k | Jupyter Notebook | 97 | Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course. | 2023-07-13 |
| 11 | FinGPT AI4Finance-Foundation | 19.9k | 2.8k | Jupyter Notebook | 76 | FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace. | 2026-04-24 |
| 12 | ml-agents Unity-Technologies | 19.4k | 4.4k | C# | 1 | The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning. | 2026-05-05 |
| 13 | trl huggingface | 18.3k | 2.7k | Python | 550 | Train transformer language models with reinforcement learning. | 2026-05-05 |
| 14 | tensor2tensor tensorflow | 17.2k | 3.7k | Python | 575 | Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research. | 2023-06-02 |
| 15 | agent-lightning microsoft | 17.1k | 1.5k | Python | 102 | The absolute trainer to light up AI agents. | 2026-04-29 |
| 16 | baselines openai | 16.7k | 4.9k | Python | 413 | OpenAI Baselines: high-quality implementations of reinforcement learning algorithms | 2024-08-01 |
| 17 | leedl-tutorial datawhalechina | 16.5k | 3.1k | Jupyter Notebook | 2 | Hung-yi Lee's Deep Learning Tutorial (recommended by Prof. Hung-yi Lee 👍, the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases | 2025-11-23 |
| 18 | numpy-ml ddbourgin | 16.3k | 3.8k | Python | 23 | Machine learning, in numpy | 2023-10-29 |
| 19 | Book-Mathematical-Foundation-of-Reinforcement-Learning MathFoundationRL | 15.9k | 1.5k | MATLAB | 0 | This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning." | 2026-03-26 |
| 20 | FinRL AI4Finance-Foundation | 15.1k | 3.3k | Jupyter Notebook | 286 | FinRL®: Financial Reinforcement Learning. 🔥 | 2026-04-05 |
| 21 | reinforcement-learning-an-introduction ShangtongZhang | 14.6k | 5.0k | Python | 16 | Python Implementation of Reinforcement Learning: An Introduction | 2024-08-09 |
| 22 | bullet3 bulletphysics | 14.5k | 3.1k | C++ | 261 | Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc. | 2025-10-22 |
| 23 | easy-rl datawhalechina | 14.1k | 2.2k | Jupyter Notebook | 56 | Reinforcement learning tutorial in Chinese (the "Mushroom Book" 🍄). Read online: https://datawhalechina.github.io/easy-rl/ | 2025-12-30 |
| 24 | awesome-artificial-intelligence owainlewis | 13.7k | 2.2k | N/A | 30 | A curated list of Artificial Intelligence (AI) courses, books, video lectures and papers. | 2025-08-12 |
| 25 | stable-baselines3 DLR-RM | 13.2k | 2.1k | Python | 54 | PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms. | 2026-05-02 |
| 26 | deep-learning-drizzle kmario23 | 12.8k | 3.0k | HTML | 4 | Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!! | 2024-10-19 |
| 27 | Gymnasium Farama-Foundation | 11.8k | 1.3k | Python | 76 | An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym) | 2026-05-01 |
| 28 | spinningup openai | 11.8k | 2.5k | Python | 180 | An educational resource to help anyone learn deep reinforcement learning. | 2024-08-05 |
| 29 | wandb wandb | 11.0k | 864 | Python | 705 | The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production. | 2026-05-06 |
| 30 | amazon-sagemaker-examples aws | 10.9k | 7.0k | Jupyter Notebook | 723 | Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker. | 2026-04-27 |
| 31 | dopamine google | 10.9k | 1.4k | Jupyter Notebook | 87 | Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. | 2026-03-24 |
| 32 | tianshou thu-ml | 10.6k | 1.3k | Python | 133 | An elegant PyTorch deep reinforcement learning library. | 2026-04-03 |
| 33 | awesome-rl aikorea | 9.7k | 1.9k | N/A | 4 | Reinforcement learning resources curated | 2023-05-25 |
| 34 | cleanrl vwxyzjn | 9.7k | 1.1k | Python | 66 | High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG) | 2026-04-20 |
| 35 | Reinforcement-learning-with-tensorflow MorvanZhou | 9.5k | 5.0k | Python | 69 | Simple Reinforcement learning tutorials; 莫烦Python AI tutorials in Chinese | 2024-03-31 |
| 36 | OpenRLHF OpenRLHF | 9.4k | 934 | Python | 295 | An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL) | 2026-05-05 |
| 37 | ART OpenPipe | 9.4k | 824 | Python | 65 | Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen3.5, GPT-OSS, Llama, and more! | 2026-05-05 |
| 38 | TensorFlow-Tutorials Hvass-Labs | 9.3k | 4.1k | Jupyter Notebook | 1 | TensorFlow Tutorials with YouTube Videos | 2021-01-15 |
| 39 | pwnagotchi evilsocket | 9.1k | 1.2k | Python | 268 | (⌐■_■) - Deep Reinforcement Learning instrumenting bettercap for WiFi pwning. | 2025-08-23 |
| 40 | machine_learning_examples lazyprogrammer | 8.9k | 6.4k | Python | 5 | A collection of machine learning examples and tutorials. | 2026-04-27 |
| 41 | vowpal_wabbit VowpalWabbit | 8.7k | 1.9k | C++ | 0 | Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning. | 2026-05-05 |
| 42 | PyTorch-Tutorial MorvanZhou | 8.5k | 3.1k | Jupyter Notebook | 24 | Build your neural network easy and fast; 莫烦Python tutorials in Chinese | 2023-03-23 |
| 43 | trax google | 8.3k | 825 | Python | 107 | Trax: Deep Learning with Clear Code and Speed | 2025-09-26 |
| 44 | pysc2 google-deepmind | 8.3k | 1.2k | Python | 56 | StarCraft II Learning Environment | 2024-07-23 |
| 45 | PaLM-rlhf-pytorch lucidrains | 7.9k | 679 | Python | 17 | Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM | 2025-10-11 |
| 46 | PokemonRedExperiments PWhiddy | 7.8k | 773 | Jupyter Notebook | 67 | Playing Pokemon Red with Reinforcement Learning | 2025-08-28 |
| 47 | TensorLayer tensorlayer | 7.4k | 1.6k | Python | 26 | Deep Learning and Reinforcement Learning Library for Scientists and Engineers | 2023-02-18 |
| 48 | Awesome-LLM-Strawberry hijkzzz | 6.9k | 368 | N/A | 6 | A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques. | 2025-12-17 |
| 49 | awesome-multimodal-ml pliang279 | 6.9k | 899 | N/A | 6 | Reading list for research topics in multimodal machine learning | 2024-08-20 |
| 50 | DeepLearningFlappyBird yenchenlin | 6.8k | 2.1k | Python | 25 | Flappy Bird hack using Deep Reinforcement Learning (Deep Q-learning). | 2024-08-07 |
| 51 | Practical_RL yandexdataschool | 6.5k | 1.8k | Jupyter Notebook | 37 | A course in reinforcement learning in the wild | 2026-03-31 |
| 52 | ai-engineering-from-scratch rohitg00 | 6.4k | 1.3k | Python | 2 | Learn it. Build it. Ship it for others. | 2026-04-28 |
| 53 | awesome-self-supervised-learning jason718 | 6.4k | 837 | N/A | 1 | A curated list of awesome self-supervised methods | 2026-02-24 |
| 54 | tensorpack tensorpack | 6.3k | 1.8k | Python | 13 | A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility | 2023-08-06 |
| 55 | tensortrade tensortrade-org | 6.2k | 1.2k | Python | 37 | An open source reinforcement learning framework for training, evaluating, and deploying robust trading agents. | 2026-02-19 |
| 56 | VLM-R1 om-ai-lab | 6.0k | 377 | Python | 164 | Solve Visual Understanding with Reinforced VLMs | 2026-03-12 |
| 57 | Deep-Reinforcement-Learning-Algorithms-with-PyTorch p-christ | 5.9k | 1.2k | Python | 44 | PyTorch implementations of deep reinforcement learning algorithms and environments | 2024-07-25 |
| 58 | awesome-ai-in-finance georgezouq | 5.9k | 681 | N/A | 7 | 🔬 A curated list of awesome LLMs & deep learning strategies & tools in financial market. | 2026-05-02 |
| 59 | PufferLib PufferAI | 5.7k | 448 | C | 26 | Puffing up reinforcement learning | 2026-05-05 |
| 60 | keras-rl keras-rl | 5.6k | 1.3k | Python | 14 | Deep Reinforcement Learning for Keras. | 2023-09-17 |
| 61 | rllm rllm-org | 5.5k | 551 | Python | 87 | Democratizing Reinforcement Learning for LLMs | 2026-05-06 |
| 62 | open_spiel google-deepmind | 5.2k | 1.1k | C++ | 14 | OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. | 2026-05-02 |
| 63 | deep-reinforcement-learning udacity | 5.2k | 2.4k | Jupyter Notebook | 2 | Repo for the Deep Reinforcement Learning Nanodegree program | 2023-11-16 |
| 64 | AReaL inclusionAI | 5.1k | 487 | Python | 33 | The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible. | 2026-05-05 |
| 65 | xtuner InternLM | 5.1k | 419 | Python | 240 | A Next-Generation Training Engine Built for Ultra-Large MoE Models | 2026-05-05 |
| 66 | Sana NVlabs | 5.1k | 346 | Python | 99 | SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer | 2026-04-14 |
| 67 | EasyR1 hiyouga | 4.9k | 372 | Python | 47 | EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL | 2026-04-06 |
| 68 | deep-rl-class huggingface | 4.9k | 789 | MDX | 82 | This repo contains the Hugging Face Deep Reinforcement Learning Course. | 2026-04-17 |
| 69 | MARL-Papers LantaoYu | 4.8k | 775 | N/A | 3 | Paper list of multi-agent reinforcement learning (MARL) | 2026-02-11 |
| 70 | trlx CarperAI | 4.7k | 484 | Python | 86 | A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF) | 2024-01-08 |
| 71 | Reinforcement-Learning andri27-ts | 4.7k | 669 | Jupyter Notebook | 4 | Learn Deep Reinforcement Learning in 60 days! Lectures & Code in Python. Reinforcement Learning + Deep Learning | 2020-06-30 |
| 72 | Deep-reinforcement-learning-with-pytorch sweetice | 4.6k | 899 | Python | 28 | PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and .... | 2023-03-24 |
| 73 | DRL wangshusen | 4.6k | 679 | N/A | 41 | Deep Reinforcement Learning | 2022-12-10 |
| 74 | dm_control google-deepmind | 4.6k | 748 | Python | 110 | Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo. | 2026-05-02 |
| 75 | DouZero kwai | 4.5k | 643 | Python | 32 | [ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning \| DouDizhu AI | 2024-06-26 |
| 76 | TensorFlow-Book BinRoot | 4.4k | 1.2k | Jupyter Notebook | 13 | Accompanying source code for Machine Learning with TensorFlow. Refer to the book for step-by-step explanations. | 2023-03-17 |
| 77 | alpha-zero-general suragnair | 4.4k | 1.2k | Jupyter Notebook | 48 | A clean implementation based on AlphaZero for any game in any framework + tutorial + Othello/Gobang/TicTacToe/Connect4 and more | 2025-01-01 |
| 78 | neurojs janhuenermann | 4.4k | 363 | JavaScript | 4 | A JavaScript deep learning and reinforcement learning library. | 2023-10-10 |
| 79 | awesome-RLHF opendilab | 4.4k | 252 | N/A | 0 | A curated list of reinforcement learning with human feedback resources (continually updated) | 2025-12-09 |
| 80 | ElegantRL AI4Finance-Foundation | 4.3k | 970 | Python | 151 | Massively Parallel Deep Reinforcement Learning. 🔥 | 2026-02-20 |
| 81 | reasoning-from-scratch rasbt | 4.3k | 618 | Jupyter Notebook | 2 | Implement a reasoning LLM in PyTorch from scratch, step by step | 2026-04-21 |
| 82 | LLM-RL-Visualized changyeyu | 4.2k | 400 | Python | 3 | 🌟 100+ original LLM/RL schematic diagrams 📚, presented by the author of 《大模型算法》 (Large Model Algorithms)! 💥 (100+ LLM/RL Algorithm Maps) | 2026-04-21 |
| 83 | Awesome-ChatGPT dalinvip | 4.2k | 387 | N/A | 1 | A collection of ChatGPT learning resources, continuously updated... | 2025-06-04 |
| 84 | acme google-deepmind | 4.0k | 535 | Python | 69 | A library of reinforcement learning components and agents | 2026-04-08 |
| 85 | fsrs4anki open-spaced-repetition | 3.9k | 159 | Jupyter Notebook | 9 | A modern Anki custom scheduling based on Free Spaced Repetition Scheduler algorithm | 2026-03-20 |
| 86 | Deep_reinforcement_learning_Course simoninithomas | 3.9k | 1.2k | Jupyter Notebook | 40 | Implementations from the free course Deep Reinforcement Learning with Tensorflow and PyTorch | 2023-05-02 |
| 87 | arXivTimes arXivTimes | 3.9k | 200 | N/A | 2050 | repository to research & share the machine learning articles | 2022-07-01 |
| 88 | pytorch-a2c-ppo-acktr-gail ikostrikov | 3.9k | 845 | Python | 88 | PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL). | 2022-05-29 |
| 89 | polyaxon polyaxon | 3.7k | 324 | N/A | 121 | MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle | 2026-04-26 |
| 90 | ReAgent facebookresearch | 3.7k | 529 | Python | 34 | A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.) | 2026-05-05 |
| 91 | Andrew-NG-Notes ashishpatel26 | 3.7k | 1.2k | Jupyter Notebook | 1 | This is Andrew NG Coursera Handwritten Notes. | 2025-08-01 |
| 92 | reinforcement-learning rlcode | 3.6k | 738 | Python | 26 | Minimal and Clean Reinforcement Learning Examples | 2023-03-24 |
| 93 | DI-engine opendilab | 3.6k | 433 | Python | 11 | OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P. | 2025-12-07 |
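| 94 | awesome-DeepLearning PaddlePaddle | 3.6k | 859 | Jupyter Notebook | 49 | Introductory, advanced, and special-topic deep learning courses, academic case studies, industry practice cases, a deep learning knowledge encyclopedia, and an interview question bank. The course, case and knowledge of Deep Learning and AI | 2024-07-25 |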
| 95 | AlphaZero_Gomoku junxiaosong | 3.6k | 1.0k | Python | 76 | An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row) | 2024-04-24 |
| 96 | football google-research | 3.6k | 1.4k | Python | 81 | Check out the new game server: | 2025-06-17 |
| 97 | introRL zhoubolei | 3.6k | 497 | N/A | 2 | Intro to Reinforcement Learning (强化学习纲要) | 2020-07-25 |
| 98 | AI-ML-Roadmap-from-scratch aadi1011 | 3.6k | 682 | N/A | 0 | Become skilled in Artificial Intelligence, Machine Learning, Generative AI, Deep Learning, Data Science, Natural Language Processing, Reinforcement Learning and more with this complete 0 to 100 repository. | 2026-05-03 |
| 99 | Reco-papers wzhe06 | 3.5k | 815 | Python | 3 | Classic papers and resources on recommendation | 2025-10-16 |
| 100 | maths-cs-ai-compendium HenryNdubuaku | 3.5k | 488 | TypeScript | 1 | Become a cracked AI/ML Research Engineer | 2026-04-17 |