This paper explores the potential of five open-source Large Language Models (LLMs) with parameter counts between 32 and 49 billion to automate improvements in code quality and developer productivity. The evaluated models (CodeLlama [1], Command-R [2], Deepseek R1-32B [3], Nemotron [4], and QwQ [5]) were assessed on their ability to refactor a large, complex C-language function from an automotive mechatronics application. The assessment focused on adherence to a set of provided code quality standards and on successful compilation of the refactored function within the larger code module. The evaluation also compared the impact of parameter count, hyperparameter tuning, model architecture, and fine-tuning. Larger models showed superior overall performance, though with notable exceptions in which smaller models performed better on specific rule categories, and hyperparameter tuning yielded modest performance improvements. Model architecture and fine-tuning had less predictable effects, suggesting that further exploration is required. Furthermore, some rules proved more difficult to apply than others, and the generated code often contained critical logical issues such as use of uninitialized variables, excessive placeholders, and missing logic. This paper provides insight into the patterns and behaviors observed in the study and into the strengths and weaknesses demonstrated by these open-source LLMs.