DeepSeek AI Introduces CodeI/O: A Novel Approach that Transforms Code-Based Reasoning Patterns into Natural Language Formats to Enhance the Reasoning Capabilities of LLMs

Large language models (LLMs) have advanced significantly in natural language processing, yet reasoning remains a persistent challenge. While tasks such as mathematical problem solving and code generation benefit from structured training data, broader reasoning tasks, such as logical deduction, scientific inference, and symbolic reasoning, suffer from sparse and fragmented data. Traditional approaches, such as continued pretraining on code, embed reasoning signals only implicitly, making it difficult for models to generalize. Even text-to-code generation methods remain tied to syntax-specific learning, limiting their applicability beyond programming-related tasks. A more structured approach is needed, one that exposes LLMs to fundamental reasoning patterns while preserving logical rigor.
DeepSeek AI research presents CodeI/O, a method that converts code-based reasoning into natural language. By transforming raw code into an input-output prediction format and expressing the reasoning steps as chain-of-thought (CoT) rationales, CodeI/O allows LLMs to internalize core reasoning processes such as logic flow planning, decision-tree traversal, and modular decomposition. Unlike conventional approaches, CodeI/O decouples reasoning from code syntax, enabling broader applicability while maintaining logical structure.
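To make this format concrete, here is a minimal sketch of what a single CodeI/O-style training instance might look like. The field names and the example function are illustrative assumptions, not the exact schema used by DeepSeek:

```python
# A minimal, hypothetical CodeI/O-style training instance.
# The model sees the function, a query, and a concrete input,
# and must predict the output together with a CoT rationale.

reference_code = """
def count_even(nums):
    return sum(1 for n in nums if n % 2 == 0)
"""

training_instance = {
    "code": reference_code,
    "query": "Given the input below, predict the function's output.",
    "input": {"nums": [3, 4, 7, 10, 12]},
    # Target: a chain-of-thought rationale followed by the answer.
    "target": (
        "The function counts even numbers. 3 is odd, 4 is even (1), "
        "7 is odd, 10 is even (2), 12 is even (3). Output: 3"
    ),
}
```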

Technical Overview and Benefits
CodeI/O follows a structured data processing pipeline (the input-output generation step is sketched in code after this list):
- Collect raw code files: More than 450K functions were gathered from multiple sources, including algorithm repositories and educational programming datasets.
- Standardize the data: DeepSeek-V2.5 was used to refine the collected code, ensuring clarity and execution compatibility.
- Generate input-output pairs: Functions were executed with varied inputs to create structured training examples for diverse reasoning tasks.
- Generate chain-of-thought rationales: Models such as DeepSeek-V2.5 produced natural-language explanations to provide structured reasoning traces.
- Verify and refine: Predictions were validated through execution, and incorrect responses were revised to improve reasoning accuracy.
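The following is a minimal sketch of the input-output pair generation step, assuming a simple random input sampler; the helper names are hypothetical and not taken from the CodeI/O codebase:

```python
import random

def count_even(nums):
    """Example collected function (stand-in for a curated raw code file)."""
    return sum(1 for n in nums if n % 2 == 0)

def sample_input():
    """Hypothetical input sampler: random integer lists of random length."""
    return [random.randint(0, 20) for _ in range(random.randint(1, 8))]

def generate_io_pairs(func, n_pairs=5):
    """Execute the function on sampled inputs to build (input, output) pairs."""
    pairs = []
    for _ in range(n_pairs):
        args = sample_input()
        pairs.append({"input": args, "output": func(args)})
    return pairs

print(generate_io_pairs(count_even))
```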
Key features of CodeI/O:
- Transformative learning: converts diverse code patterns into natural-language reasoning, extending reasoning beyond programming contexts.
- Syntax-decoupled learning: separates logical reasoning from code syntax, improving adaptability across reasoning tasks.
- Multi-task improvement: enhances performance across symbolic, scientific, logical, mathematical, and commonsense reasoning domains.
- Verifiability: predictions can be validated by matching cached ground truths or by re-executing the code (a sketch follows this list).
- Iterative refinement: a refined version, CodeI/O++, applies multi-turn revision to further improve reasoning accuracy.
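As an illustration of the verification step, here is a minimal sketch that re-executes the reference function and compares the result against the model's predicted answer; the answer-parsing convention is a hypothetical placeholder:

```python
import ast

def verify_prediction(func, args, model_answer_text):
    """Check a predicted output against ground truth obtained by re-execution.

    Assumes the model's response ends with a Python literal such as '3'.
    """
    ground_truth = func(args)            # re-execute to get the true output
    try:
        predicted = ast.literal_eval(model_answer_text.strip().split()[-1])
    except (ValueError, SyntaxError):
        return False                     # unparseable answer counts as wrong
    return predicted == ground_truth

def count_even(nums):
    return sum(1 for n in nums if n % 2 == 0)

print(verify_prediction(count_even, [3, 4, 7, 10, 12], "... Output: 3"))  # True
```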

Experimental Results and Performance
The impact of CodeI/O was tested on four base models (ranging from 7B to 30B parameters) across 14 reasoning benchmarks covering logic, symbolic reasoning, mathematics, scientific inference, and commonsense reasoning.
Key findings:
- Consistent improvements: training on CodeI/O yields higher scores across reasoning benchmarks compared with traditional training methods.
- Cross-task generalization: unlike existing methods that improve specific tasks while degrading performance elsewhere, CodeI/O shows balanced gains.
- Comparison with baselines: CodeI/O outperforms datasets such as OpenMathInstruct-2, OpenCoder-SFT-Stage-1, and WebInstruct.
- Effectiveness of multi-turn revision: CodeI/O++ further improves results by iteratively revising incorrect responses with execution feedback, raising the quality of reasoning (a sketch of this loop follows the list).
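A minimal sketch of such a revision loop is shown below; both `model_generate` and the feedback format are illustrative assumptions rather than the actual CodeI/O++ interface:

```python
def revise_with_feedback(model_generate, verify, question, max_turns=2):
    """Hypothetical multi-turn revision loop in the spirit of CodeI/O++.

    model_generate(question, feedback) -> response text
    verify(response) -> (is_correct, feedback) based on re-execution
    """
    feedback = None
    response = None
    for turn in range(max_turns):
        response = model_generate(question, feedback)
        is_correct, feedback = verify(response)
        if is_correct:
            return response, True    # verified response, keep as-is
    return response, False           # still incorrect after revision
```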
For example, on logic and symbolic reasoning benchmarks such as BBH and CRUXEval, CodeI/O yields significant performance gains. On mathematical reasoning tasks (GSM8K, MATH, and MMLU-STEM), it demonstrates improvements over existing baselines. Even on commonsense reasoning, where code-based approaches typically struggle, CodeI/O maintains strong results.

Conclusion
CodeI/O proposes a structured approach to enhancing LLM reasoning by leveraging input-output transformations of real-world code. Rather than focusing on isolated reasoning tasks, it extracts universal reasoning patterns and converts them into natural-language rationales. This structured learning approach ensures that models acquire robust reasoning skills across diverse domains.
The introduction of multi-turn revision (CodeI/O++) further improves reasoning accuracy, indicating that iterative learning from execution feedback enhances model reliability. By making predictions verifiable, CodeI/O provides a scalable and trustworthy method for improving LLM reasoning.
By bridging code-based and natural-language reasoning, CodeI/O offers a promising direction for enhancing LLMs' cognitive capabilities beyond programming-related tasks.
Check out the Paper and GitHub page. All credit for this research goes to the researchers of this project.