
What are the criteria to define if a programming language is compiled or interpreted?

+6
−0

Often I hear people saying that some language is either compiled or interpreted. Examples: "Java is compiled" or "JavaScript is interpreted" and so on.

What are the criteria to define languages as such?


2 answers

+7
−0

First of all, let's recap two basic concepts: specification and implementation. The classic analogy is that a specification is like a cake recipe, and an implementation is the actual cake. The spec only tells you what to do (it might say how some steps can be done, but that's not always the case). The implementation is where all the steps are actually executed.

All languages have (or at least should have) a specification that defines their basic features: the syntax (keywords, statements, variables, functions/classes/whatever structure they have, etc) and semantics (what each command or statement means).

But with the spec alone you can't actually use the language to write programs. We need an implementation, which consists of one or more programs that read the source code, validate it and execute it according to the rules defined by the spec.

So when we say that some language is compiled, we're actually talking about the implementation being used. The language (the specification) is just the text that describes how it works. The implementation can use a compiler, that reads the source code and transforms it into something that can be executed.

I understand this confusion, as we often say "language X is compiled" as a simplification, to facilitate communication and understanding. Most languages have one implementation that is the most used (if not the only one), and if this implementation uses a compiler, everybody ends up saying that the language is compiled, even if there are other implementations that do it differently.

Therefore, being compiled or interpreted is a characteristic of the implementation, not of the language. Specifications usually don't have such a restriction, as we can find things like C interpreters, for example. Whether this is useful or will be widely used is another story. The point is: there's no technical restriction.
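
To make this concrete, here is a minimal sketch of one toy language with two implementations. The "language" (integer literals combined with `+` and `*`) and the function names are made up for illustration, and parsing is borrowed from Python's `ast` module to keep it short. One implementation evaluates the syntax tree directly, the other translates it once into a callable and runs it later. Both follow the same "spec" and give the same result.

```python
import ast
import operator

# Hypothetical "spec" of a toy language: integer literals combined with + and *.
OPS = {ast.Add: operator.add, ast.Mult: operator.mul}


def parse(source):
    """Borrow Python's parser; the toy language is a subset of Python expressions."""
    return ast.parse(source, mode="eval").body


def interpret(node):
    """Implementation A: walk the syntax tree and evaluate it directly."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        return OPS[type(node.op)](interpret(node.left), interpret(node.right))
    raise ValueError("not part of the toy language")


def compile_to_closure(node):
    """Implementation B: translate the tree once into a callable, to be run later."""
    if isinstance(node, ast.Constant):
        value = node.value
        return lambda: value
    if isinstance(node, ast.BinOp):
        op = OPS[type(node.op)]
        left = compile_to_closure(node.left)
        right = compile_to_closure(node.right)
        return lambda: op(left(), right())
    raise ValueError("not part of the toy language")


tree = parse("2 + 3 * 4")
result_a = interpret(tree)          # "interpreted" implementation
program = compile_to_closure(tree)  # "compiled" implementation: translate first...
result_b = program()                # ...execute afterwards
print(result_a, result_b)           # 14 14
```

Nothing in the toy "spec" says which of the two approaches must be used; that choice belongs to each implementation.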


This "compiled vs interpreted" thing is a false dichotomy, because it makes you think that an implementation can be only one of them. But things are actually more complicated than that.

To this day, there are people who think that a compiler only transforms source code into native/machine code. Like GCC, a C compiler that generates an executable.

But Java, for example, is different: its compiler converts the source code to byte code, an intermediate binary format (for Java, that's the .class files). This byte code is then executed by a virtual machine (VM). For Java, it's the JVM (Java Virtual Machine).

Although many people think that all of this is just one thing, it's not. There are two distinct specifications: one for the language (Java Language Specification) and another one for the JVM (Java Virtual Machine Specification). This allows other languages to run on the JVM: you just need to create a compiler that, instead of generating machine code, generates JVM-compatible byte code. And many were created, such as for Python, PHP and JavaScript, just to mention some mainstream languages.

BTW, having two separate specs also allows the existence of many JVM implementations.

Anyway, Java compilers don't generate machine code like C compilers do; there's an intermediate step that generates the byte code. C# does the same: the compiler generates byte code (CIL - Common Intermediate Language), which is executed by a VM (CLR - Common Language Runtime). And it's common to hear people saying that Java and C# are "compiled languages", so the idea that a compiler always generates machine code is not true.

But if "generate byte code that is executed by a VM" is a characteristic of a "compiled language", then why are there people who say that Python and JavaScript are interpreted? After all, their most used implementations do the same as Java and C#.

For Python, the most used implementation is CPython, which compiles the source and generates byte code, which in turn is executed by a VM. For JavaScript, the most used implementations are V8 (Google, used by Chrome, Node.js and Deno), SpiderMonkey (Firefox) and JavaScriptCore (Safari), and all of them generate byte code which is then executed by a VM.
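
You can see CPython's compile step for yourself. The sketch below uses the built-in `compile()` and the standard `dis` module; the source string and the `<example>` file name are just placeholders. The exact opcodes printed depend on the CPython version, so don't expect the same listing everywhere.

```python
import dis

source = "x = 1 + 2\nprint(x)"

# compile() turns source text into a code object without executing it.
code = compile(source, "<example>", "exec")

# co_code holds the raw byte code as a bytes object.
print(type(code.co_code), len(code.co_code))

# dis.dis() prints a human-readable listing of the byte code.
# The opcode names and layout vary between CPython versions.
dis.dis(code)
```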

The difference is that in Java and C#, the compilation and the VM's execution are done in separate steps, while Python and JS do everything in one single step, giving the impression that it's all one thing. But for all of them, the process is the same (compile -> byte code -> VM). Even so, you still see people saying that Java and C# are compiled, while Python and JS are interpreted.
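
In fact, the two steps can be pulled apart in CPython too. Here's a small sketch using the built-in `compile()` and the standard `marshal` module (the variable names are just for illustration). CPython's `.pyc` cache files work in a roughly similar way, storing a marshalled code object after a header, and the marshal format is specific to each CPython version.

```python
import marshal

source = "result = 6 * 7"

# Step 1: compile only. Nothing from the source is executed here.
code = compile(source, "<example>", "exec")
blob = marshal.dumps(code)  # plain bytes that could be written to disk

# Step 2: some time later, load the byte code and let the VM execute it.
loaded = marshal.loads(blob)
namespace = {}
exec(loaded, namespace)
print(namespace["result"])  # 42
```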

"Well, then they're what?"

This "compiled vs interpreted" thing is the wrong way to think when it comes to how languages are implemented. Stop trying to fit languages into one of those boxes, because they're not even boxes; it's more like a big soup with everything mixed up.

The pure interpretation of source code is a rare thing nowadays. Some notable exceptions are Shell Script interpreters, such as Bash, C Shell, Z Shell, etc, and some Lisp implementations. What's more common today is a hybrid approach, as many implementations are using a compiler to generate byte code, and a VM to execute it.

BTW, the VM can be implemented as an interpreter, executing commands as they are read from the byte code (which is a binary format; so while source code interpretation has become rare, byte code interpretation hasn't). Hence, we could say that implementations that use this hybrid approach are both "compiled" and "interpreted", but each one is used in a different part of the process.
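
As a rough sketch of what "a VM implemented as an interpreter" means, here's a tiny stack machine. The instruction set (`PUSH`, `ADD`, `MUL`) is invented for this example and is not the byte code of any real VM. The loop reads one instruction at a time from a binary sequence and executes it right away.

```python
# Hypothetical instruction set, invented for illustration only.
PUSH, ADD, MUL = 0, 1, 2


def run(bytecode):
    """Interpret the byte code: fetch one instruction, execute it, repeat."""
    stack = []
    pc = 0  # position of the next instruction
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == PUSH:
            # PUSH is followed by a one-byte operand.
            stack.append(bytecode[pc + 1])
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop()


# Byte code that a compiler could have produced for "2 + 3 * 4".
program = bytes([PUSH, 2, PUSH, 3, PUSH, 4, MUL, ADD])
print(run(program))  # 14
```

No source code is involved at this stage: the interpreter only ever sees the binary instructions.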

And there's more: while the VM is executing the byte code, it can also compile it to native/machine code, and it usually does that during execution. That's called JIT (Just-in-Time) compilation. That's right, inside the VM there may be another compiler, but instead of converting source code to binary, it converts one binary (byte code) to another binary (machine code, usually optimized for the environment it's running on). Java has a JIT, and so do C#, JavaScript (yes, the browser does that behind the scenes) and PHP (since version 8). CPython (as of Python 3.14) has an experimental JIT (disabled by default), and there are other implementations, such as PyPy, that have a JIT.

Nowadays, many implementations use this approach: a compiler to generate byte code, which is executed by a VM (usually implemented as an interpreter). And the VM can have another compiler to convert the byte code to machine code. All of that shows how the "compiled vs interpreted" thing is outdated and doesn't make sense anymore.


Disclaimer: this answer is based on this post in my blog (in Portuguese).


+0
−4

If the code is compiled every time before running, then it is interpreted.

If you get a ready executable binary and run it again and again, then it is compiled.

Java is sort of both. You get an executable binary of byte code, but each running JVM compiles it again to machine code before running it.
