| File | Date | Author | Commit |
| AMD64 | 2021-06-19 | gatewood | [cd1c62] Move .LSleef_rempitabdp to rempitabdp.s |
| BOOK | 2021-08-29 | gatewood | [13e773] Add qpdf to README, update NEWS, bump version t... |
| CSCLASSICS | 2021-10-18 | gatewood | [8f93a8] Add CSCLASSICS/COSINE.BAS demo program |
| HAM | 2021-11-13 | gatewood | [f190f9] Use only 1 million loop iterations in P367.BAS |
| HAM_compile_output | 2021-07-07 | gatewood | [12adbe] Fix error message when DEL character (ASCII 127... |
| HAM_run_input | 2020-01-26 | gatewood | [1f6445] Convert HAM test suite to new style based on GN... |
| HAM_run_output | 2021-06-29 | gatewood | [570f9e] Fix wide expected output for more tests |
| ISAAC64 | 2015-05-11 | gatewood | [08cfaf] Added Bob Jenkins' original public domain C lan... |
| NBS | 2021-06-29 | gatewood | [40e0da] Fix benign typo in NBS test P008.BAS file and i... |
| NBS_compile_output | 2021-07-06 | gatewood | [d20252] Fixed error message when strange characters are... |
| NBS_run_input | 2014-04-04 | gatewood | [4ba8fa] Correct NBS test 203 input/output test files |
| NBS_run_output | 2021-07-03 | gatewood | [8f3d0c] Add suport for INWIDE=1 variable to Makefile.ru... |
| SLEEF | 2021-01-23 | gatewood | [39ea56] SLEEF/AVX -> SLEEF/SLEEF-3.4.1-AVX, SLEEF/SSE2 ... |
| benchmark | 2016-12-30 | gatewood | [444383] add benchmark stuff |
| dgay | 2015-05-22 | gatewood | [208ef8] add back lost patch |
| tests | 2021-05-09 | gatewood | [a4b304] Add supplementary tests/MRTIME.BAS example |
| unit_tests_run_output | 2023-12-18 | gatewood | [3c3e01] Rewrite self-test code so output lines don't ge... |
| vDSO | 2020-06-04 | gatewood | [33e1e4] Properly use vDSO to access gettimeofday() for ... |
| .gitattributes | 2020-03-04 | gatewood | [5bdeb4] add .gitattributes |
| BASICC | 2021-06-19 | gatewood | [90d9af] Update BASICC, BASICCS, & BASICCW to hunt for as |
| BASICC.1 | 2021-06-01 | gatewood | [5185de] Remove AVX support |
| BASICC.clang | 2021-06-19 | gatewood | [4980e6] Add BASICC.clang, BASICCS.clang, & BASICCW.clang |
| BASICCS | 2021-06-19 | gatewood | [90d9af] Update BASICC, BASICCS, & BASICCW to hunt for as |
| BASICCS.1 | 2021-06-01 | gatewood | [5185de] Remove AVX support |
| BASICCS.clang | 2021-06-19 | gatewood | [4980e6] Add BASICC.clang, BASICCS.clang, & BASICCW.clang |
| BASICCW | 2021-06-19 | gatewood | [90d9af] Update BASICC, BASICCS, & BASICCW to hunt for as |
| BASICCW.1 | 2021-06-01 | gatewood | [5185de] Remove AVX support |
| BASICCW.clang | 2021-06-19 | gatewood | [4980e6] Add BASICC.clang, BASICCS.clang, & BASICCW.clang |
| BOOST_LICENSE-1.0.TXT | 2020-05-12 | gatewood | [c5a7be] Replace SSE2 versions of SIN, COS, and TAN with... |
| CC0-1.0-Universal | 2020-06-04 | gatewood | [33e1e4] Properly use vDSO to access gettimeofday() for ... |
| COPYING | 2014-07-15 | gatewood | [8b0f0f] convert tabs to spaces |
| ChangeLog | 2023-12-29 | gatewood | [006ed5] Cleanups in error handling in parser2.c file |
| ECMA-116-NUMERIC-FUNCTIONS.TXT | 2021-03-02 | gatewood | [364f50] Fix spelling error |
| ECMA-55.TXT | 2021-07-13 | gatewood | [0eed43] Apply patch from Doug Kearns for a typo on ECMA... |
| ECMA55-slideshow.odp | 2017-11-19 | gatewood | [05aff5] Update ECMA55-Slideshow documents |
| ECMA55-slideshow.pdf | 2017-11-19 | gatewood | [05aff5] Update ECMA55-Slideshow documents |
| GNU_FDL | 2016-06-09 | gatewood | [3df88b] Documentation uses the GNU FDL Version 1.3 |
| INSTALL | 2021-07-03 | gatewood | [96ce7d] Document in INSTALL how to create and test 132 ... |
| INTEL_CET.TXT | 2021-07-02 | gatewood | [ae849e] Reflow paragraphs for 80 columns and make some ... |
| LUCENT_LICENSE.TXT | 2020-05-14 | gatewood | [54dcac] Make -l/-L show more license information |
| Makefile.clang | 2023-12-18 | gatewood | [1c8277] Update Makefile.clang to match recent Makefile.... |
| Makefile.gcc | 2023-12-21 | gatewood | [521157] Disable analyzer for sha256, since heavy-duty m... |
| Makefile.runtests | 2023-12-18 | gatewood | [ddec13] Simplify linking in of assembly files, removing... |
| Makefile.runtests2 | 2023-12-18 | gatewood | [ddec13] Simplify linking in of assembly files, removing... |
| Makefile.tcc | 2023-12-24 | gatewood | [7e1d52] Update Makefile.tcc to pass LINKER argument to ... |
| NEWS | 2023-12-24 | gatewood | [cc550c] Update NEWS and README for recent changes |
| PUFF_LICENSE.TXT | 2021-11-22 | gatewood | [7f6d12] Added Mark Adler's puff.[ch] from zlib-1.2.11 c... |
| README | 2023-12-24 | gatewood | [cc550c] Update NEWS and README for recent changes |
| README.clang | 2021-05-01 | gatewood | [91050c] Update README.clang to note that versions >= 12... |
| README.pcc | 2021-07-02 | gatewood | [a0e0d3] Clarify the problems with pcc a bit more in the... |
| TESTING | 2021-07-02 | gatewood | [c00c0c] Add TESTING file to explain how to run the self... |
| THANKS | 2021-07-13 | gatewood | [0eed43] Apply patch from Doug Kearns for a typo on ECMA... |
| TODO | 2021-07-05 | gatewood | [253dd7] Update TODO list |
| asmgen.c | 2023-12-18 | gatewood | [57de54] Be more careful about flushing output in asmgen... |
| asmgen.h | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| ast.c | 2021-06-01 | gatewood | [5185de] Remove AVX support |
| ast.h | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| codegen.c | 2023-12-18 | gatewood | [ddec13] Simplify linking in of assembly files, removing... |
| codegen.h | 2021-06-13 | gatewood | [b65409] Make more code generation function names begin ... |
| computers-03-00069.pdf | 2015-04-01 | gatewood | [ada319] Add a copy of the MDPI Computers paper I wrote ... |
| dag.c | 2021-11-10 | gatewood | [dc5487] Switch from malloc() to xmalloc() |
| dag.h | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| datum.dot | 2016-08-02 | gatewood | [375e0d] Update datum.dot to reflect current reality whe... |
| dtoa5_normal.c | 2021-04-21 | gatewood | [d95a98] Fix bit-shift undefined behavior |
| dtoa5_normal.h | 2014-06-30 | gatewood | [346b90] add missing header guard macros |
| dumpregs.s | 2020-05-11 | gatewood | [4d4400] Fix wrong comments on EFLAGS processing for dum... |
| dumpstack.s | 2020-05-11 | gatewood | [f5fdea] Add public domain stack dumper |
| ecma55.1 | 2021-11-23 | gatewood | [7cd242] Add license information for puff.[ch] to ecma55... |
| error_messages.c | 2021-10-23 | gatewood | [7bb5b2] Convert FLOAT_BUFFER_LEN to named constant |
| error_messages.h | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| g_fmt_BASIC.s | 2016-12-31 | gatewood | [49bb8d] add missing attribution |
| g_fmt_BASIC_normal.c | 2020-04-21 | gatewood | [a0c90f] Fix warning from clang about implicit conversion |
| g_fmt_BASIC_normal.h | 2014-06-30 | gatewood | [346b90] add missing header guard macros |
| globals.c | 2023-05-06 | gatewood | [23953c] Silence warnings from gcc-13.1.0 analyzer |
| globals.h | 2021-11-08 | gatewood | [820ba4] Update compiler to use new and improved hexdump2 |
| grammar.txt | 2020-11-12 | gatewood | [c7c691] Improve discussion about unary minus problems |
| hexdump2.1 | 2021-11-03 | gatewood | [2134e4] Major fixes for hexdump2/hexdump2mm |
| hexdump2.c | 2023-12-24 | gatewood | [d2a21c] Fix memory allocation error in hexdump2 found w... |
| hexdump2.h | 2021-11-08 | gatewood | [820ba4] Update compiler to use new and improved hexdump2 |
| load_textdata.c | 2023-12-08 | gatewood | [131148] Fix the load_textdata WRITEWITHNEWLINE by addin... |
| load_textdata.h | 2023-12-18 | gatewood | [ddec13] Simplify linking in of assembly files, removing... |
| main.c | 2023-12-09 | gatewood | [d8eacb] Fix leaks in parser2.c when FATAL() is called, ... |
| mathnotes.txt | 2021-06-01 | gatewood | [5185de] Remove AVX support |
| optimizer.c | 2023-11-20 | gatewood | [64d0ab] Add tree_postorder_rw() for updating AST in-place |
| optimizer.h | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| parseinput.c | 2021-06-12 | gatewood | [c61ef7] remove dead code |
| parseinput.txt | 2020-01-13 | gatewood | [d46a57] Remove trailing whitespace |
| parser2.c | 2023-12-29 | gatewood | [006ed5] Cleanups in error handling in parser2.c file |
| parser2.h | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| peephole.c | 2023-12-24 | gatewood | [c2e037] Fix another "leak of file descriptor" error in ... |
| peephole.h | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| puff.c | 2023-05-11 | gatewood | [5dc406] fix typos in comments in puff.c |
| puff.h | 2021-11-22 | gatewood | [7f6d12] Added Mark Adler's puff.[ch] from zlib-1.2.11 c... |
| raw_registers.c | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| raw_registers.h | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| robert1.c | 2021-04-21 | gatewood | [cabb7c] Fix bit-shift undefined behavior |
| scanner3.c | 2023-12-24 | gatewood | [e05334] A 'break;' after ICE is silly, since ICE is nor... |
| scanner3.h | 2021-05-23 | gatewood | [4b9f1d] Implement DATE$ and TIME$ string functions |
| semantic_checks.c | 2023-12-13 | gatewood | [1a1bb9] Shut the -fanalyzer up about null destination p... |
| semantic_checks.h | 2021-01-19 | gatewood | [56aa70] Update copyright year in files |
| sha256.1 | 2021-11-13 | gatewood | [8824d9] Add simple manpage for sha256 utility |
| sha256.c | 2023-12-16 | gatewood | [39a4ec] Avoid dynamic memory when not needed (in self-t... |
| sha256.h | 2021-10-31 | gatewood | [3960f9] Add sha256 utility It generates the same output... |
| structure.dot | 2020-01-13 | gatewood | [d46a57] Remove trailing whitespace |
| symbol_table.c | 2023-12-16 | gatewood | [39a4ec] Avoid dynamic memory when not needed (in self-t... |
| symbol_table.h | 2021-05-08 | gatewood | [215790] Support PI and MAXNUM in DATA statements when e... |
| textdata.s.in | 2023-12-18 | gatewood | [ddec13] Simplify linking in of assembly files, removing... |
| tree.c | 2023-11-20 | gatewood | [64d0ab] Add tree_postorder_rw() for updating AST in-place |
| tree.h | 2023-11-20 | gatewood | [64d0ab] Add tree_postorder_rw() for updating AST in-place |
| zonermore.c | 2021-11-07 | gatewood | [a2ea54] Fix a warning about declaration after statement |
| zonermore.txt | 2014-04-04 | gatewood | [103458] documentation update |

Read Me
[The text in this file will only look correct if you use a fixed-width font]
This software is a compiler for 'Minimal BASIC' as specified by the ECMA-55
standard. The target is AMD64/EM64T/x86-64 machines running a modern Linux
distribution. This compiler will create assembly language output files.
These must be assembled into object files and linked to create an executable.
The assembly dialect used is that of GNU gas, since that will be available on
any modern general purpose x86-64 Linux distribution. No libc or libm is used
by the generated code, which allows creating very small executables. To keep
the generated code small and simple, output of SIN, COS, TAN, ATAN, EXP, POW,
LOG, RND, and RANDOMIZE is only emitted if those features are required by the
input BASIC program.
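
Emitting runtime routines only when they are needed can be done with a set
of feature bits that is filled in while the program is parsed. The C sketch
below only illustrates that general idea. It is not code from ecma55, and
the names FEAT_SIN, FEAT_COS, FEAT_RND, note_feature, and emit_runtime are
hypothetical.

```c
#include <stdio.h>

/* Hypothetical feature bits; the names are illustrative, not from ecma55. */
enum {
    FEAT_SIN = 1u << 0,
    FEAT_COS = 1u << 1,
    FEAT_RND = 1u << 2
};

/* Record that the BASIC program being compiled uses a feature. */
static unsigned note_feature(unsigned used, unsigned feature)
{
    return used | feature;
}

/* Write only the runtime routines whose feature bit is set.
   Returns the number of routines written. */
static int emit_runtime(unsigned used, FILE *out)
{
    int count = 0;

    if (used & FEAT_SIN) { fputs("# SIN routine would go here\n", out); count++; }
    if (used & FEAT_COS) { fputs("# COS routine would go here\n", out); count++; }
    if (used & FEAT_RND) { fputs("# RND routine would go here\n", out); count++; }
    return count;
}
```

A program that never calls SIN, COS, or RND leaves all bits clear, so
emit_runtime() writes nothing and the executable stays small.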
After completing this project, I did find one other FOSS compiler that claims
to be able to handle much of ANSI Full BASIC, the BASIC Accelerator at
http://hp.vector.co.jp/authors/VA008683/english/BASICAcc.htm, but the output is
Object Pascal for the FreePascal compiler at http://www.freepascal.org/, and
not assembly. Also, they implement only what ECMA-116 calls OPTION ARITHMETIC
NATIVE mode, which is essentially the same mode implemented in this compiler.
The same developers have created an interpreter called Decimal BASIC at
http://hp.vector.co.jp/authors/VA008683/english/ that does attempt to support
the required decimal arithmetic. Strangely, these projects did not turn up in
normal web searches, but only when I searched for "BASIC-1 OPTION ARITHMETIC
DECIMAL".
In 2015 I learned of Jorge Giner Cordero's excellent bas55 interpreter for
ECMA-55 Minimal BASIC which you can download at this URL:
https://jorgicor.niobe.org/bas55
Note that bas55, like most vintage BASIC interpreters, initializes numeric
variable values to zero and does not detect uses of uninitialized variables by
default in batch mode. However, the --debug switch enables detection which
results in warnings for uninitialized variable values. When bas55 is run in
interactive mode, the --debug switch is enabled by default. The ECMA-55
standard states that programs intended to be portable _should_ explicitly
initialize all variables before use, and the ecma55 compiler _requires_ this,
and treats such accesses as fatal errors.
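
One common way to catch reads of uninitialized variables is to keep an
"assigned" flag next to each value. The C sketch below shows that technique
under stated assumptions: the text above does not say whether ecma55 checks
at compile time or at run time, and struct numvar, var_store, and var_load
are hypothetical names.

```c
#include <stdbool.h>

/* One BASIC numeric variable plus a flag saying whether it was ever assigned. */
struct numvar {
    double value;
    bool   assigned;
};

/* Assignment (LET) stores the value and marks the variable as initialized. */
static void var_store(struct numvar *v, double x)
{
    v->value = x;
    v->assigned = true;
}

/* A read succeeds only if the variable was assigned first.
   Returns false instead of silently yielding zero. */
static bool var_load(const struct numvar *v, double *out)
{
    if (!v->assigned)
        return false;   /* a caller would report a fatal error here */
    *out = v->value;
    return true;
}
```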
The license for the groff format manual pages and the included book
"An Introduction to Programming with ECMA-55 Minimal BASIC" is the GNU Free
Documentation License version 1.3 only. See the included GNU_FDL for details.
The author of the book, the groff format manual pages, and the actual compiler
software is John Gatewood Ham. The source code for the compiler itself is
available under the GNU General Public License version 2 only. See the
included file COPYING for details.
The following NBS tests were kindly supplied by Emmanuel Roche:
56, 57, 65, 66, 67, 68, 69, 109, 117, 118, 119, 120, 121, 122, 123, and 124.
The rest came from the Google Books PDF files available on the Internet.
Fixes for the following NBS tests were kindly supplied by Jorge Giner Cordero:
12, 14, 25, 39, 43, 74, 108, 115, 128, 185, 191, and 206.
The included runtime library assembly routines for SIN, COS, TAN, ATAN, LOG,
EXP, and POW are tweaked versions of routines from SLEEF-3.4.1 by Naoki
Shibata (https://github.com/shibatch/sleef), and are covered by the Boost
Software License Version 1.0. This is a FOSS license available for download
at http://www.boost.org/LICENSE_1_0.txt; a copy is included with this
software as BOOST_LICENSE-1.0.TXT.
The included runtime library assembly routines for RND and RANDOMIZE are
modified versions of public domain code from Bob Jenkins' ISAAC-64:
http://burtleburtle.net/bob/rand/isaacafa.html
The included runtime library assembly routines for floating point input and
output are derived from David M. Gay's dtoa.c and g_fmt.c, which are free
to use but not public domain. See the comments in those source files for
details.
http://netlib.sandia.gov/fp/index.html
The included runtime library assembly routines for accessing the Linux
vDSO come from the Linux kernel and are written by Andrew Lutomirski.
That code uses the Creative Commons Zero license for the reference vDSO
parser and the GNU GPL v2.0 only for the stack walking and pointer setup
code run at program startup that uses that reference parser.
The included runtime library assembly routines for accessing the timezone
database to generate correct local date and time values come from David
Olson's tzcode2020f (now maintained by Paul Eggert). The code used (generated
from localtime.c and some definitions from headers) is in the public domain.
The puff.[ch] deflate code was written by Mark Adler and is from the zlib
contrib directory from zlib-1.2.11. I altered it by adding some type casts to
silence some warnings. It uses a custom license that requires attribution, but
the code is free to use for any purpose, and is copyrighted by Mark Adler.
I wrote a special file dumpregs.s for dumping CPU registers while debugging,
and unlike the main compiler, this one file is public domain. The compiler
does not use it, but I used it when debugging programs and include it for
other people who might work at the assembly-language level.
The ECMA-55 standard was chosen over the "ANSI X3.60-1978 minimal BASIC"
standard since it is free. ANSI, despite canceling the standard, still
keeps the ancient standard locked down and available only if you pay
for it, which is a quite mean-spirited attempt to prevent any compliant
free and open source implementations from being written. The same attitude
exists with ISO for the "ISO 6373:1984 Data processing -- Programming
languages -- Minimal BASIC" standard. This standard has many other names,
such as "AS 2797-1985 Programming language - Minimal BASIC", and the only
free one is ECMA-55, since all the other standards bodies are trying to
kill BASIC forever.
http://www.ecma-international.org/publications/files/ECMA-ST-WITHDRAWN/
Files in this distribution.
BOOK/LICENSE
This is the complete text of the GNU Free Documentation License Version
1.3, which is used for the book. It is identical to the GNU_FDL file,
but is bundled with the book for the case when people use the book
independently of the rest of the compiler distribution.
BOOK/Makefile
This is the project build file for creating the book.
BOOK/duplex
This file is used to enable duplex printing of the book.
BOOK/Learn_BASIC.tex
This contains the LaTeX source code of the included book "An Introduction
to Programming with ECMA-55 Minimal BASIC". This file is documentation and
is licensed under the GNU Free Documentation License Version 1.3 only.
BOOK/Learn_BASIC.pdf
This contains the included book "An Introduction to Programming with
ECMA-55 Minimal BASIC". This file is documentation and is licensed under
the GNU Free Documentation License Version 1.3 only.
GNU_FDL
This is the complete text of the GNU Free Documentation License Version
1.3, which is used for the groff format manual pages and the included
book.
COPYING
This contains a copy of the GNU GPL version 2 license for the compiler
itself.
ChangeLog
This contains a high-level overview of changes sorted by time in
ascending date order with the newest changes at the end of the file.
globals.[ch]
This contains global variables that must be shared across all modules.
scanner3.[ch]
This is the new scanner that converts the input byte stream into
tokens for the parser. This uses a hand-coded switch-based scanner.
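
For readers unfamiliar with the style, the C sketch below shows what a
hand-coded, switch-based scanner looks like in general. It is not taken from
scanner3.c. The token kinds and the function name next_token are made up for
illustration, and real Minimal BASIC tokens (line numbers, strings, keywords)
need more cases.

```c
#include <ctype.h>

enum tok { TOK_EOF, TOK_NUMBER, TOK_NAME, TOK_OP, TOK_UNKNOWN };

/* Classify the token starting at *p and advance *p past it. */
static enum tok next_token(const char **p)
{
    const char *s = *p;
    enum tok kind;

    while (*s == ' ')                 /* skip blanks between tokens */
        s++;

    switch (*s) {
    case '\0':
        kind = TOK_EOF;               /* stay on the terminator */
        break;
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
        while (isdigit((unsigned char)*s))
            s++;
        kind = TOK_NUMBER;
        break;
    case '+': case '-': case '*': case '/': case '^': case '=':
        s++;
        kind = TOK_OP;
        break;
    default:
        if (isupper((unsigned char)*s)) {
            /* an upper-case letter followed by letters or digits */
            while (isupper((unsigned char)*s) || isdigit((unsigned char)*s))
                s++;
            kind = TOK_NAME;
        } else {
            s++;                      /* consume one unexpected character */
            kind = TOK_UNKNOWN;
        }
        break;
    }
    *p = s;
    return kind;
}
```

Calling next_token() repeatedly on "LET X1 = 42 + Y" yields NAME, NAME, OP,
NUMBER, OP, NAME, and then EOF.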
parser2.[ch]
This contains the parser that uses the token stream created by the
scanner and generates an AST used by the semantic_checks and asmgen
modules.
symbol_table.[ch]
This contains the symbol table module.
asmgen.[ch]
This contains the code that walks the AST the parser creates and
generates the assembly using the low-level routines available in
the codegen module. It also uses the raw_registers, optimizer, and
symbol_table modules.
semantic_checks.[ch]
This contains the code that walks the AST the parser creates and
performs semantic checks, symbol table population, and jump target
checking with help from the symbol_table module.
codegen.[ch]
This contains low-level routines that emit the GAS assembler output. It
also contains some of the runtime functions and macros. The runtime
library code in this file is GPLv2.
main.c
This contains the main routine that calls everything else. It does
the command-line argument processing, loads the input file into a
buffer, calls the scanner to convert that into a token stream, then
calls the parser to process the token stream.
g_fmt_BASIC.s
This contains the assembly code for my tweaked version of
David M. Gay's g_fmt.c file. The process to generate this is in
the magic.txt file in the dgay sub-directory. This is used as part
of a compiled BASIC program's runtime. A tweaked copy of this is
included in the codegen.c file. The runtime library code in this file is
Copyright (C) by Lucent Technologies, but is free to use since it includes
the copyright notice.
dtoa5_normal.[ch]
This contains the C code for my tweaked version of David M. Gay's
dtoa.c file. This is used by the compiler to ensure it formats
numbers in exactly the same format as the runtime. The runtime library
code in this file is Copyright (C) by Lucent Technologies, but is free to
use since it includes the copyright notice. clang versions older than
12.0 will not build this correctly when PIE=1 is combined with the large
code model.
g_fmt_BASIC_normal.[ch]
This contains the C code for my tweaked version of David M. Gay's
g_fmt.c file. This is used by the compiler to ensure it formats
numbers in exactly the same format as the runtime. The runtime library
code in this file is Copyright (C) by Lucent Technologies, but is free to
use since it includes the copyright notice. clang versions older than
12.0 will not build this correctly when PIE=1 is combined with the large
code model.
textdata.s
AMD64/*.s
These files contain assembly language for various runtime features
and get included in the generated assembly code when needed. The files
in AMD64 get included by the textdata.s file. The files get linked
directly into the ecma55 executable, so they do not need to be present
for ecma55 to work; they only need to be present when you build ecma55.
tree.[ch]
This contains the base n-ary tree code. These nodes are used
to create the AST, which is the intermediate representation created
by the parser.
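
An n-ary tree is often stored in first-child/next-sibling form so that a
node can have any number of children. The C sketch below shows that layout
with a postorder walk. It is a generic illustration, not the data structure
or API of tree.[ch], and struct node, postorder, struct trace, and record
are hypothetical names.

```c
#include <stddef.h>

/* First-child / next-sibling layout: any number of children per node. */
struct node {
    int          label;
    struct node *child;    /* leftmost child, or NULL */
    struct node *sibling;  /* next child of the same parent, or NULL */
};

/* Visit all children left to right, then the node itself. */
static void postorder(const struct node *n,
                      void (*visit)(const struct node *, void *),
                      void *ctx)
{
    const struct node *c;

    if (n == NULL)
        return;
    for (c = n->child; c != NULL; c = c->sibling)
        postorder(c, visit, ctx);
    visit(n, ctx);
}

/* Example visitor: append each label to a small buffer. */
struct trace {
    int labels[16];
    int count;
};

static void record(const struct node *n, void *ctx)
{
    struct trace *t = ctx;

    if (t->count < 16)
        t->labels[t->count++] = n->label;
}
```

Postorder is the natural order for code generation from an expression tree,
because the operands of an operator are visited before the operator itself.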
raw_registers.[ch]
This contains the register management code.
dag.[ch]
This contains the code to convert an AST into a DAG for arithmetic
expression evaluation.
optimizer.[ch]
This contains the AST optimizer code for optimizing arithmetic
expressions. Currently, the only supported optimization is constant
folding.
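
Constant folding replaces an operator whose operands are all constants with
the computed result, so X + 2 * 3 becomes X + 6 at compile time. The C
sketch below shows the technique on a tiny expression tree. It is not the
code in optimizer.c; the node layout and the names struct expr and fold are
assumptions made for this example.

```c
#include <stddef.h>

enum kind { K_CONST, K_VAR, K_ADD, K_MUL };

struct expr {
    enum kind    kind;
    double       value;   /* meaningful only for K_CONST */
    struct expr *lhs;     /* operands; non-NULL for K_ADD and K_MUL */
    struct expr *rhs;
};

/* Fold bottom-up: when both operands of an operator are constants,
   rewrite the operator node in place as a constant. */
static void fold(struct expr *e)
{
    if (e == NULL || e->kind == K_CONST || e->kind == K_VAR)
        return;
    fold(e->lhs);
    fold(e->rhs);
    if (e->lhs->kind == K_CONST && e->rhs->kind == K_CONST) {
        e->value = (e->kind == K_ADD) ? e->lhs->value + e->rhs->value
                                      : e->lhs->value * e->rhs->value;
        e->kind = K_CONST;
        e->lhs = NULL;    /* operands are no longer referenced */
        e->rhs = NULL;
    }
}
```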
ast.[ch]
This contains the code for pretty-printing which traverses the AST that
is generated during a parse and regenerates a semantically equivalent
program. It is used to support the ecma55 compiler's -P and -R options.
Makefile.gcc
This is the project build file for use by the make program for gcc.
Makefile.clang
This is the project build file for use by the make program for clang.
Makefile.tcc
This is the project build file for use by the make program for tcc.
Makefile.runtests
This is the parallel test running harness for the NBS Minimal BASIC
test suite.
Makefile.runtests2
This is the parallel test running harness for the HAM Minimal BASIC
test suite.
grammar.txt
This contains a copy of the Minimal BASIC grammar.
ecma55.1
This is the man page for the compiler. This file is documentation and
is licensed under the GNU Free Documentation License Version 1.3 only.
BASICC.1
This is the man page for the BASICC script. This file is documentation and
is licensed under the GNU Free Documentation License Version 1.3 only.
BASICCS.1
This is the man page for the BASICCS script. This file is documentation and
is licensed under the GNU Free Documentation License Version 1.3 only.
BASICCW.1
This is the man page for the BASICCW script. This file is documentation and
is licensed under the GNU Free Documentation License Version 1.3 only.
ECMA-55.TXT
This file contains the text of the ECMA-55 standard for
"Minimal BASIC". This was retyped by me from the PDF version
both to get a smaller file and to allow easy searching.
BASICC
This file is a script that will compile, assemble, and link
an input program. Note that the input program must have the
extension '.BAS' for this script to work.
BASICCS
This file is a script that will compile, assemble, and link
an input program. Note that the input program must have the
extension '.BAS' for this script to work. This version tells
the compiler to generate 32bit math for the arithmetic
expressions, generating output more closely matching the NBS
Minimal BASIC test suite expectations.
BASICCW
This file is a script that will compile, assemble, and link
an input program. Note that the input program must have the
extension '.BAS' for this script to work. This version tells
the compiler to use 132 column output instead of 80 column
output. The floating point numbers will be displayed with
up to 15 digits with up to 3 digit exponents. The output in
this case does not match the NBS standard's examples. However,
the output does show the floating point values with the greater
precision that 64bit floating math supports.
dumpregs.s
This is an assembler source file you can build and link in to an
executable. It contains 'dumpregs', a procedure that takes no
arguments and returns no values but does dump the registers used
for normal programming for this project, including the xmm registers,
eflags, and mxcsr flags. It does not dump the FP registers or state
since this project uses SIMD exclusively for floating point math.
Unlike the main compiler, this file I wrote is in the public domain.
I hope anybody who needs to code in assembler in 64bit on AMD64/EM64T
in Linux will find it useful.
datum.dot
This is the graphviz dot source file for the diagram of the finite
state machine used by the INPUT runtime subsystem.
parseinput.c
This is the C source code for the INPUT runtime subsystem. Compile
with -DTROUBLE to get a trace of the states as the transitions occur.
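
The C sketch below shows the general shape of a table-free finite state
machine with an optional transition trace that is compiled in only when
TROUBLE is defined. It is a toy that recognizes [+-]digits[.digits] and is
not the state machine in parseinput.c or datum.dot; the state names, the
TRACE macro, and the functions step and accepts are hypothetical.

```c
#include <stdio.h>

enum state { S_START, S_SIGN, S_INT, S_DOT, S_FRAC, S_ERROR };

#ifdef TROUBLE
#define TRACE(from, ch, to) \
    fprintf(stderr, "%d --'%c'--> %d\n", (int)(from), (ch), (int)(to))
#else
#define TRACE(from, ch, to) ((void)0)
#endif

/* One transition of the toy machine. */
static enum state step(enum state s, char ch)
{
    int digit = (ch >= '0' && ch <= '9');

    switch (s) {
    case S_START: return digit ? S_INT
                       : (ch == '+' || ch == '-') ? S_SIGN : S_ERROR;
    case S_SIGN:  return digit ? S_INT : S_ERROR;
    case S_INT:   return digit ? S_INT : (ch == '.') ? S_DOT : S_ERROR;
    case S_DOT:   return digit ? S_FRAC : S_ERROR;
    case S_FRAC:  return digit ? S_FRAC : S_ERROR;
    default:      return S_ERROR;   /* the error state is a trap */
    }
}

/* Returns 1 if the whole string is accepted, 0 otherwise. */
static int accepts(const char *text)
{
    enum state s = S_START;

    for (; *text != '\0'; text++) {
        enum state next = step(s, *text);
        TRACE(s, *text, next);
        s = next;
    }
    return s == S_INT || s == S_FRAC;
}
```

Building this with -DTROUBLE prints one line per transition to stderr;
without it the TRACE calls compile to nothing.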
zonermore.c
This is the C source code for the PRINT runtime subsystem.
robert1.c
This is the C source code for the RND function and RANDOMIZE
statement. Unlike the compiler itself, this file is in the public
domain and is derived from Bob Jenkins' ISAAC-64.
http://burtleburtle.net/bob/rand/isaacafa.html
peephole.c
This contains the very simple peephole optimizer code. It reads
the assembly language file generated by the compiler and generates
a new assembly language file. It removes any superfluous
'pushsaddr'/'popsaddr %rdi' sequences.
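
A peephole pass of this kind looks at a small window of adjacent lines and
deletes patterns that cancel out. The C sketch below works on an array of
lines instead of files and is not the code in peephole.c. The exact spelling
of the two lines (PUSH_LINE and POP_LINE) is an assumption, since the text
above does not show the generated assembly, and drop_push_pop is a
hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Assumed spelling of the redundant pair; adjust to the real output. */
#define PUSH_LINE "pushsaddr %rdi"
#define POP_LINE  "popsaddr %rdi"

/* Compact lines[] in place, dropping every PUSH_LINE that is immediately
   followed by POP_LINE. Returns the new number of lines. */
static size_t drop_push_pop(const char *lines[], size_t n)
{
    size_t in = 0, out = 0;

    while (in < n) {
        if (in + 1 < n &&
            strcmp(lines[in], PUSH_LINE) == 0 &&
            strcmp(lines[in + 1], POP_LINE) == 0) {
            in += 2;                      /* skip the redundant pair */
        } else {
            lines[out++] = lines[in++];   /* keep everything else */
        }
    }
    return out;
}
```

This is a single pass, so a pair that only becomes adjacent after another
pair is removed survives; calling the function again until the count stops
changing handles that case.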
ECMA55-slideshow.odp
This is a slideshow generated with LibreOffice. It gives a good
overview of this compiler project, including the motivation, the
overall structure, and suggestions for future work.
ECMA55-slideshow.pdf
PDF/A-1a version of ECMA55-slideshow.odp. This includes the fonts
and should display identically on all machines with a graphical
PDF file viewer.
mathnotes.txt
Notes about possible future work regarding the math code in the
compiler.
parseinput.txt
This file contains the instructions used to create the original
parseinput.s file from the parseinput.c file. The parseinput.s
file is then updated as explained to create the final version of
the routines which were included in the codegen.c file.
zonermore.txt
This file contains the magic incantation used to create the original
zonermore.s file from the zonermore.c file. The zonermore.s was
then edited to produce the final version of the routines which were
included in the codegen.c file.
dgay/magic.txt
This file contains the instructions for generating the assembly language
versions of David M. Gay's code used in the codegen.c file.
GETTING THE CODE
The source code was created and is maintained on a Linux system, and uses
an ASCII encoding and UNIX line endings (0x0A).
If you are reading this you should have a copy of the code from a snapshot or
release tar.xz file. Between snapshots some changes may exist only in the
upstream git repository. If you want to start with the absolutely latest
version from the upstream git repository, you need to do a clone
operation like this:
git clone https://git.code.sf.net/p/buraphakit/MB_git MinimalBASIC
This creates the 'MinimalBASIC' subdirectory which has the code and a local
copy of the upstream repository. After the initial clone, assuming you didn't
modify anything, you can easily use git to stay up-to-date with a
'make -fMakefile.gcc distclean', followed by a 'git pull', followed by a
'make canrelease'. The 'make canrelease' for LLVM/clang 12 takes about 15
minutes with -j24 on an AMD(R) Ryzen(TM) 9 3900X @ 3.8 GHz with 64GB RAM and a
Samsung 870 EVO 1TB SATA SSD. You cannot push changes upstream directly with
git. If you want to contribute a fix or improvement, please generate a patch
with 'git format-patch' and submit it on the SourceForge site
(Support->Patches).
The SourceForge site for this project has this URL:
http://sourceforge.net/projects/buraphakit/
Information on obtaining and using the git version control software
is available from the git web site which has this URL:
https://www.git-scm.org/
REPORTING BUGS
If you found a bug but do not know how to fix it, please submit a bug report on
the SourceForge site (Support->Bugs). If you know what assembly should be
generated but do not know how to modify the compiler to make that happen,
please include the .BAS program file and the assembly that should have been
generated in the bug report. If you just found a problem but have no idea how
to fix it, please include the .BAS program file in the bug report, and explain
what you think should have happened, and what actually happened instead.
EXAMPLE SESSION:
1. Create source
     vi WHATEVER.BAS
2. Compile it
     ./BASICC WHATEVER.BAS
3. Run it
     ./WHATEVER
You can optionally strip the WHATEVER executable with the 'strip' command and
it will still work, and it will probably be (slightly) smaller.
NOTES ON BUILDING THE COMPILER:
To perform an overall check of the compiler using gcc to build it, you should
do something like this:
make -Otarget -fMakefile.gcc distclean
make -Otarget -j -l12 -fMakefile.gcc canrelease PIE=1 LTO=1 2>&1 | tee log.gcc
That example is tuned for a 12 core Zen 2 machine with an x86_64 AMD Ryzen 9
3900X 12-Core processor with hyper-threading disabled in the BIOS. The -l
should be the number of cores, and the -j should not have a number. On an
Intel i7-4790 4-Core processor with hyper-threading disabled in the BIOS, I use
-j -l4, for instance. You can also use Makefile.clang or Makefile.tcc which
use the clang and tcc compilers respectively.
Should you want to do a leak check after modifying the code, here is
what you would do:
make -fMakefile.gcc distclean all COMPILE_MODE=DEBUG2 PIE=0 LTO=0
valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all \
--redzone-size=128 --read-var-info=yes --leak-resolution=high \
--track-origins=yes --malloc-fill=FF --free-fill=AA \
--num-callers=40 ./ecma55 -v BAD.BAS
Some older clang compiler versions generate code that results in valgrind
giving a horrible tombstone at the start talking about a DIE it cannot parse,
so use gcc if that happens to you.
At this time (December 2023), current tcc does not produce code that
valgrind 3.22.0 can understand completely, and valgrind always says there
are leaks. I do not know whether this is a problem with tcc or valgrind yet.
If you want to check for memory problems, you should use the address sanitizer
which is supported by clang and gcc. You need to rebuild like this for gcc:
make -fMakefile.gcc distclean
make -fMakefile.gcc all COMPILE_MODE=DEBUG PIE=0 LTO=0
That is an unoptimized build. You can do this if you want optimization AND the
address sanitizer for clang:
make -fMakefile.clang distclean
make -fMakefile.clang all COMPILE_MODE=ASAN PIE=0 LTO=0
Some bugs only show up with ASAN, but debugging is harder with the optimized
code when you are in gdb. If you are having trouble, I suggest trying DEBUG
first, and if that isn't enough then try ASAN.
If you are using some older version of clang, then to test the compiler on some
file BOGUS.BAS after you compiled with the sanitizer support, you would do this
(this style of setting environment variables will work in any reasonably
current version of bash shell):
ASAN_SYMBOLIZER_PATH=/path/to/llvm-symbolizer \
ASAN_OPTIONS="symbolize=1,detect_odr_violation=0" \
UBSAN_OPTIONS=print_stacktrace=1 \
./ecma55 -v BOGUS.BAS
You will need to adjust the symbolizer path to match your system and clang
compiler version.
Any gcc older than 6.2 should really not be used. Versions 6.2 and later have
good address sanitizer support and no environment variable settings are
required; it "just works".
LLVM/clang < 8.0.0 is not supported.
To build a production version with gcc, just do this (adjust the 4 to match
the number of cores in your CPU):
make -Otarget -fMakefile.gcc distclean
make -Otarget -j -l4 -fMakefile.gcc all PIE=1 LTO=1
Alternatively, to build a production version with clang, just do:
make -Otarget -fMakefile.clang distclean
make -Otarget -j -l4 -fMakefile.clang all PIE=1 LTO=1
The tcc compiler never has any actual releases. They don't have snapshots
either. If you want to use tcc, build tcc from git. I have tested with
revision 48798969c558975a78f6441c2f287483436e12d9 successfully. You will need
to use git, fetch that revision, and build tcc yourself and then install it.
Once you are sure tcc works, you can try using tcc with this project like this:
make -Otarget -fMakefile.tcc distclean
make -Otarget -j -l4 -fMakefile.tcc ecma55
Note that while tcc does not document the fact in any place I could find,
according to Michael Matz on the tinycc-devel mailing list, tcc does not
support any code model except small.
The tcc compiler does not support the address sanitizer features, but it is
good for ensuring that you haven't used any horribly non-portable features in
the C code for the compiler itself.
If you have clang, you can perform a static analysis of the C code with the
clang static analyzer like this:
make -Otarget -fMakefile.clang distclean
scan-build make -fMakefile.clang ecma55 PIE=0 LTO=0
Do not specify a COMPILE_MODE, and remember that scan-build only works with
clang. If problems are found, the analyzer program output tells you how to
read the test results.
Another option is to use cppcheck, but you need at least version 1.81, and
on some systems that means you must build it yourself and install it in
/usr/local. Once you have a new enough version of cppcheck, you can run the
cppcheck static analyzer like this:
make -Otarget -fMakefile.gcc distclean
cppcheck --force *.[ch]
The COMPILE_MODE= switch can be any of these:
  ASAN    optimized build with address and undefined behavior sanitizers
  DEBUG   unoptimized build with address and undefined behavior sanitizers
          for gcc/llvm, bounds checking for tcc
  DEBUG2  unoptimized build without sanitizers for use with gdb and valgrind
  DEBUG3  unoptimized build with custom memory interceptor for tracing every
          single allocate/deallocate - this makes the programs run extremely
          slowly, but can be helpful when you just cannot track down a
          memory problem. In addition to detecting memory leaks, it will
          also hexdump every single byte of leaked memory if the program
          terminates successfully.
If you do not set COMPILE_MODE, you get an optimized build with no debugging
information and no sanitizers.
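As a rough illustration of how the COMPILE_MODE values combine with the
Makefiles, the following shell sketch prints (but does not run) the rebuild
commands for gcc or clang. The function name rebuild_cmds is hypothetical and
is not part of this project, and the PIE=0 LTO=0 settings are simply copied
from the sanitizer examples above.

```shell
# Hypothetical helper, not part of ecma55: print the rebuild commands
# for a compiler (gcc or clang) and an optional COMPILE_MODE.
rebuild_cmds() {
    compiler=$1
    mode=${2:-}
    case $compiler in
        gcc|clang) ;;
        *) echo "unsupported compiler: $compiler" >&2; return 1 ;;
    esac
    # Reject misspelled modes early instead of letting make ignore them.
    case $mode in
        ''|ASAN|DEBUG|DEBUG2|DEBUG3) ;;
        *) echo "unknown COMPILE_MODE: $mode" >&2; return 1 ;;
    esac
    # Changing compiler or mode requires a full rebuild, hence distclean.
    echo "make -fMakefile.$compiler distclean"
    if [ -n "$mode" ]; then
        echo "make -fMakefile.$compiler all COMPILE_MODE=$mode PIE=0 LTO=0"
    else
        # No COMPILE_MODE: optimized build, no debug info, no sanitizers.
        echo "make -fMakefile.$compiler all"
    fi
}

rebuild_cmds clang ASAN
```

Validating the mode name first is a deliberate choice in this sketch: make
silently accepts a misspelled variable name or value, so a typo would
otherwise produce the default optimized build without any warning.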
The distclean target on the Makefiles removes all generated files. If you
change compilers or the COMPILE_MODE, you need to rebuild everything, and
distclean makes that easy. There is no default compiler so you need to
specify what Makefile you want with the -f flag. For most people, gcc
is the reasonable choice and you just do this:
make -Otarget -j -l4 -fMakefile.gcc all
SOFTWARE USED FOR BUILDING AND TESTING
On Ubuntu-21.04 on a 64 bit x86-64 Linux system:
GNU sed 4.7, Linux kernel 5.11.0-22, git 2.30.2, ghostscript 9.53.3,
GNU binutils-2.36.1, cppcheck-2.3, gcc 10.3.0, bash-5.1.4, clang 12.0.0,
grep 3.6, qpdf-10.3.1, gzip 1.10, mandoc 1.14.5, and texlive 2020.
On a custom from-scratch 64 bit x86-64 Linux system:
GNU sed 4.9, Linux kernel 6.6.7 (vanilla+ipset-7.19), git 2.43.0, GNU
binutils-2.41, ghostscript 10.02.1, cppcheck-2.12.1, gcc-13.2.0, bash 5.2.21,
clang 17.0.6, grep 3.11, qpdf-11.6.3, gzip 1.13, mandoc 1.14.6, and texlive
2023.
OTHER TOOLS USED
glibc 2.38, GNU make 4.4.1, groff 1.23.0, valgrind-3.22.0,
and patched tcc from git 48798969c558975a78f6441c2f287483436e12d9.
NOTES
The gcc-11.2.0 compiler's static analysis reports a memory leak in optimizer.c
that, while genuine, is on a fatal error path. In the case of a fatal error,
the compiler does not attempt to free all dynamically allocated memory but
instead just aborts, _by design_. The analyzer doesn't stop for noreturn
functions either, which is arguably an imperfection in the static analyzer.
It is safe to ignore these reported errors:
optimizer.c:442:5: warning: leak of 'xstrdup ("optimizer.c", &__func__, 441, &tbuf)' [CWE-401]
optimizer.c:389:5: warning: leak of 'xstrdup ("optimizer.c", &__func__, 388, &tbuf)' [CWE-401]
optimizer.c:273:5: warning: leak of 'xstrdup ("optimizer.c", &__func__, 272, &tbuf)' [CWE-401]
optimizer.c:131:5: warning: leak of 'xstrdup ("optimizer.c", &__func__, 130, &tbuf)' [CWE-401]
BUILD TWEAKS
When using gcc or clang, to get a PIE (position independent executable) program, use
PIE=1 on your command line. To use LTO (link time optimization), use LTO=1 on
your command line. The linker used for gcc is now gold by default, and the linker
for clang is lld. The tcc compiler does not support PIE or LTO. Some
examples will help:
make -Otarget -fMakefile.gcc PIE=1 LTO=1
That will use gcc, build a PIE executable, and will use link time
optimization.
make -Otarget -fMakefile.clang PIE=0 LTO=1
That will use clang, build a normal executable, and will use link time
optimization. WARNING: The clang compiler generates bad PIE code for versions
before 12.0.
The default value for PIE depends on how the C compiler was built. If the
compiler was configured to generate PIE by default, PIE will be 1 by default,
otherwise it will be zero. LTO always defaults to zero for both gcc and clang.
The mold linker works with gcc, as long as you have mold version 1.11.0 or newer.
However, it works better with mold version 2.3.0 or newer.
IMPLEMENTATION-DEFINED FEATURES
ACCURACY is about 15 digits of precision
IEEE754 double, as implemented by Intel/AMD CPUs.
With -s switch, about 7 digits of precision
IEEE754 single, as implemented by Intel/AMD CPUs.
END OF LINE = ASCII value 10
SIGNIFICANCE-WIDTH = 7
With -w switch, 18
EXRAD-WIDTH = 2
With -w switch, 3
INITIAL VALUE OF VARIABLES
numeric variables are initialized to SNaN (signaling Not-A-Number) and will
force an exception if they are read before they are written.
string variables are initialized to an ASCII 21 byte, followed by
"uninitialized", and then 4 ASCII 0 bytes. The 21 will force an
exception if they are read before they are written.
INPUT-PROMPT = "? "
LONGEST STRING THAT CAN BE RETAINED = 18
VALUE OF MACHINE INFINITESIMAL = 2^-1074 (denormal), 2^-1022 (normal)
With -s switch,
2^-149 (denormal), 2^-126 (normal)
VALUE OF MACHINE INFINITY = +/- Infinity (Intel CPU has special values for
this)
MARGIN = 80
with -w switch, 132
INPUT_WIDTH = 72
This can be changed with "make -fMakefile.gcc distclean all
CPPFLAGS='-DINPUT_WIDTH=256'" where MAXCOLUMN=INPUT_WIDTH+1, but this
breaks NBS test #202 and makes the resulting compiler not strictly
ECMA-55 compliant.
PRECISION is 15 digits of precision
IEEE754 double, as implemented by Intel/AMD CPUs.
With -s switch, about 7 digits of precision
IEEE754 single, as implemented by Intel/AMD CPUs.
PRINT ZONE WIDTH = 15
With -w switch, 26
PSEUDO-RANDOM NUMBER SEQUENCE is from ISAAC-64, see robert1.c for details.
BATCH MODE INPUT uses standard UNIX redirection of STDIN
OUTPUT WIDTH = 80 columns
With -w switch, 132 columns
MAXIMUM ARRAY SUBSCRIPT VALUE = 10000000
NOTES:
1) The implementation-defined numeric functions use doubles, not singles,
in their internal representation, even with the -s switch.
2) OUTPUT WIDTH is sometimes called 'margin' in the ECMA-55 standard.
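To relate MARGIN and PRINT ZONE WIDTH, here is a small shell sketch that
counts how many whole print zones fit on one output line. It assumes that
zones are laid out left to right from column one and that a partial zone at
the end of the line is not counted; that is an assumption for illustration,
not something taken from the ecma55 source. The function name zones_per_line
is hypothetical.

```shell
# Count the whole print zones that fit in one line using integer division.
# Arguments: margin (output width) and print zone width, from the list above.
zones_per_line() {
    echo $(( $1 / $2 ))
}

zones_per_line 80 15     # default: prints 5
zones_per_line 132 26    # with -w switch: prints 5
```

Under that assumption both the default and the -w settings give five zones
per line.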
DOCUMENTED BEHAVIOR
1. Attempts to use the value of uninitialized variables will result in
a fatal exception 'READ OF UNINITIALIZED VARIABLE'.
DOCUMENTED EXTENSIONS ACTIVATED WITH -X OPTION
1. Lower-case letters, backslash, and the characters in "[]{}|@\~`" are
permitted within a quoted string.
2. Lower-case letters, backslash, and the characters in "[]{}|@\~`" are
permitted in a REM statement after the REM keyword.
3. Support for AND, OR, and NOT in conditional expressions.
4. Support for EXIT FOR statement.
5. If both -X and -O3 are specified, DAG optimization is used on
expressions which might alter numerical results slightly because some
redundant subexpressions are only evaluated once, but it should provide
better runtime performance and work well in most cases. It is protected
by the -X switch and the default behavior of the compiler remains
conservative.
6. LEN() function which takes a string variable or string literal value and
returns the number of ASCII characters as an integer (but stored in the
usual floating point format used for all numeric values) is supported.
7. String comparison is extended to support { '<', '<=', '>', '>=' }.
8. ACOS(), ASIN(), CEIL(), DEG(), FP(), IP(), LOG2(), LOG10(), MAX(), MIN(),
MOD(), PI, RAD(), MAXNUM, REMAINDER(), COSH(), SINH(), TANH(), SEC(),
CSC(), COT(), ROUND(), TRUNCATE(), and ANGLE() functions from the ECMA-116
Full BASIC standard are supported.
TEST MACHINE INFORMATION
Original development of the 1.X versions was done with Fedora 20 64-bit on an
Intel(R) Core(TM) 2 Duo E4700 @ 2.6GHz machine. Most modern testing is done
on a Linux-from-scratch descended 64-bit machine with an AMD(R) Ryzen(TM) 9 3900X
CPU @ 3.8GHz. Occasional testing is done with Ubuntu 22.04 inside both QEMU
and LVM2 virtualization on a Windows 11 machine. Surprisingly, the Ubuntu
inside LVM2/Windows testing caused some issues with stdout/stderr to manifest,
so that testing was indeed valuable. The ecma55 code is regularly tested with
gcc, clang, and tcc.
OBTAINING SOFTWARE
You need at least one working compiler and the GNU binutils. Almost every
Linux distribution will work as is if you choose the gcc compiler. If you want
to use clang, many distributions have packages you can install. For tcc,
you really need to build it from source with the version noted
elsewhere in this file. For the manual pages you can use mandoc as an
alternative to groff. If you really want to modify or rebuild the included
book, you will need a complete TeXlive installation. The shell scripts in this
project require a bash 4.x shell, but you know you really should be using the
current bash 5.X, right?
+------------+-----------------------------------------------------------------+
|software | Where to get sources |
+------------+-----------------------------------------------------------------+
|bash | https://ftp.gnu.org/gnu/bash/ |
|binutils | https://ftp.gnu.org/gnu/binutils/ |
| | This has the required assembler and recommended linker |
| +-----------------------------------------------------------------+
|Boost Software License |
| http://www.boost.org/LICENSE_1_0.txt |
| +-----------------------------------------------------------------+
|clang | http://llvm.org/ |
|coreutils | https://ftp.gnu.org/gnu/coreutils/ |
| | for cut, head, sort, wc, etc. |
|cppcheck | http://sourceforge.net/projects/cppcheck/ |
| +-----------------------------------------------------------------+
|Creative Commons Zero License |
| http://creativecommons.org/publicdomain/zero/1.0/legalcode |
|Actual upstream text file of license is _very_ hard to find on their website: |
| https://creativecommons.org/publicdomain/zero/1.0/legalcode.txt |
| +-----------------------------------------------------------------+
|diffutils | https://ftp.gnu.org/gnu/diffutils/ |
|dtoa/g_fmt | http://www.netlib.org/fp/ |
|FDL 1.3 | http://www.gnu.org/licenses/ |
|file | ftp://ftp.astron.com/pub/file/ |
|gcc | https://ftp.gnu.org/gnu/gcc/ |
|ghostscript | https://github.com/ArtifexSoftware/ghostpdl-downloads/releases |
|git | https://www.git-scm.org/ |
|glibc | https://ftp.gnu.org/gnu/glibc/ |
|GPLv2 | http://www.gnu.org/licenses/ |
|grep | https://ftp.gnu.org/gnu/grep/ |
|groff | https://ftp.gnu.org/gnu/groff/ |
|gzip | https://ftp.gnu.org/gnu/gzip/ |
|ISAAC64 | http://burtleburtle.net/bob/rand/isaacafa.html |
|make | https://ftp.gnu.org/gnu/make/ |
|mandoc | http://mdocml.bsd.lv/ |
|mold | https://github.com/rui314/mold |
|musl | http://www.musl-libc.org/ |
|qpdf | http://qpdf.sourceforge.net/ |
| | https://github.com/qpdf/qpdf/ |
|sed | https://ftp.gnu.org/gnu/sed/ |
|SLEEF | https://github.com/shibatch/sleef |
|tar | https://ftp.gnu.org/gnu/tar/ |
|tcc | git://repo.or.cz/tinycc.git |
| | Yeah, you have to pull from git for this. They NEVER have any |
| | formal releases, ever. |
|texlive | http://tug.org/texlive/ |
|tzcode | ftp://ftp.iana.org:/tz/releases/ |
| | or |
| | http://www.iana.org/time-zones |
|unifdef | http://dotat.at/prog/unifdef |
|valgrind | http://www.valgrind.org/ |
|zlib | http://www.zlib.org/ |
+------------+-----------------------------------------------------------------+
TESTING
To run a complete regression test, use the 'canrelease' target and specify
the C compiler you want to use, like this for gcc on a 4-core machine:
make -Otarget -j -l4 -fMakefile.gcc canrelease PIE=0 LTO=0 2>&1 | \
tee logfile.gcc.nopie.nolto
and just change the Makefile.gcc to try either of the other two supported
compilers (leaving out the PIE and LTO switches for tcc, and
making sure PIE=0 for clang versions less than 12.0.0).
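The per-compiler differences just described can be sketched as a small shell
function that prints (but does not run) the canrelease command for each
supported compiler. The function name canrelease_cmd is hypothetical and is
not part of this project; redirecting the output to a log file with tee is
left out for brevity.

```shell
# Hypothetical helper, not part of ecma55: print the canrelease command
# for one compiler. tcc supports neither PIE nor LTO, so both are omitted.
canrelease_cmd() {
    compiler=$1
    case $compiler in
        gcc|clang)
            echo "make -Otarget -j -l4 -fMakefile.$compiler canrelease PIE=0 LTO=0" ;;
        tcc)
            echo "make -Otarget -j -l4 -fMakefile.tcc canrelease" ;;
        *)
            echo "unsupported compiler: $compiler" >&2
            return 1 ;;
    esac
}

for c in gcc clang tcc; do
    canrelease_cmd "$c"
done
```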
You need to have valgrind (3.11.0 or newer) installed if you want the leak
testing to work. You need a modern gcc (>=10.x) and a modern llvm/clang
(>=11.0.0) for the address and undefined behavior sanitizers
to work. The llvm/clang people actually modified the assembly dialect, so
you must use the llvm-mc assembler. The Makefile.clang takes care of
this for versions 8 through 17, but for any other versions you would need to
modify the Makefile.clang yourself. You may need to adjust the
ASAN_SYMBOLIZER_PATH to be correct for your clang installation, since it can
vary depending on which version of clang and which Linux distribution you use.
TESTING OPTIMIZATIONS
The compiler now includes some simple optimizations. To see their effect, one
needs a long-running program with some loops. Optimization level zero means
disable optimizations. Optimization level one does constant folding on the
expression tree. Optimization level two switches to a DAG and removes common
sub-expressions. Optimization level three, only available when also specifying
the -X switch, will do some simple algebraic simplifications on the DAG like
detecting C-C and replacing it with zero. At this time, all optimizations are
local to an individual arithmetic expression. Still, speedups can be seen
using this simple sequence:
cp tests/ADDBENCH.BAS .
./ecma55 -O0 ADDBENCH.BAS -o ADDBENCH.BAS.s.O0
./ecma55 -O1 ADDBENCH.BAS -o ADDBENCH.BAS.s.O1
./ecma55 -O2 ADDBENCH.BAS -o ADDBENCH.BAS.s.O2
./ecma55 -O3 -X ADDBENCH.BAS -o ADDBENCH.BAS.s.O3
as ADDBENCH.BAS.s.O0 -o ADDBENCH0.o
as ADDBENCH.BAS.s.O1 -o ADDBENCH1.o
as ADDBENCH.BAS.s.O2 -o ADDBENCH2.o
as ADDBENCH.BAS.s.O3 -o ADDBENCH3.o
ld -nostdlib -z defs -z nodefaultlib -z nodlopen -z noexecstack -Bstatic \
--no-omagic -m elf_x86_64 -o ADDBENCH0 ADDBENCH0.o
ld -nostdlib -z defs -z nodefaultlib -z nodlopen -z noexecstack -Bstatic \
--no-omagic -m elf_x86_64 -o ADDBENCH1 ADDBENCH1.o
ld -nostdlib -z defs -z nodefaultlib -z nodlopen -z noexecstack -Bstatic \
--no-omagic -m elf_x86_64 -o ADDBENCH2 ADDBENCH2.o
ld -nostdlib -z defs -z nodefaultlib -z nodlopen -z noexecstack -Bstatic \
--no-omagic -m elf_x86_64 -o ADDBENCH3 ADDBENCH3.o
time -p ./ADDBENCH0
time -p ./ADDBENCH1
time -p ./ADDBENCH2
time -p ./ADDBENCH3
On an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz, the times for the benchmark
programs are as follows:
$ time -p ./ADDBENCH0
*** TEST PASSED ***
real 261.00
user 260.71
sys 0.01
$ time -p ./ADDBENCH1
*** TEST PASSED ***
real 129.84
user 129.84
sys 0.00
$ time -p ./ADDBENCH2
*** TEST PASSED ***
real 104.05
user 104.04
sys 0.00
$ time -p ./ADDBENCH3
*** TEST PASSED ***
real 91.31
user 91.31
sys 0.00
The constant folding at optimization level one is the most effective, but each
increasing level provides some improvement. The benchmark in question is
rather contrived, and real-world programs would not see such dramatic
improvements, but still should have improved run times compared to unoptimized
programs.
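To put numbers on those improvements, the following shell sketch computes the
speedup of each optimization level relative to the -O0 baseline, using the
"real" times listed above. The function name speedup is hypothetical; it only
needs a POSIX awk.

```shell
# Speedup of an optimized run relative to the -O0 baseline,
# computed as baseline time divided by optimized time.
speedup() {
    awk -v base="$1" -v opt="$2" 'BEGIN { printf "%.2f\n", base / opt }'
}

speedup 261.00 129.84    # -O1:    prints 2.01
speedup 261.00 104.05    # -O2:    prints 2.51
speedup 261.00 91.31     # -O3 -X: prints 2.86
```

Constant folding alone roughly halves the run time of this benchmark, and the
two DAG-based levels each add a smaller gain on top of that.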
IMPLEMENTATION NOTES
To use extensions, the -X option must be specified. When using the easy
wrappers, you can specify the switches with the ECMA55FLAGS environment
variable. For instance, to specify that you want extensions and you want
SSE4.1 instructions to be used when compiling a file DEMO.BAS, you would
do this:
ECMA55FLAGS='-X -4' ./BASICC DEMO.BAS
The scanner adjusts itself automatically if necessary for accepting extensions.
The parser then is run to create an abstract syntax tree (AST) in a second
pass, calling for tokens from the scanner as required.
This tree is then walked once to populate line number information, again to
convert that to a DAG (just for jump targets), again to populate the symbol
table with information about the variables, optionally again in an
optimization pass for constant folding and/or DAG conversion, and finally
once more to generate the assembly code.
Register allocation for arithmetic expressions is done on an
expression-by-expression basis as part of the code generation. It is not
sophisticated.
If the -P option is used, then instead of the three tree walks just described,
a different walk is done that regenerates the source code.
If the -X option is used, extensions are accepted. These include allowing
lower-case letters in comments and strings, supporting AND, OR, and NOT in
conditional expressions, supporting the EXIT FOR statement, and providing many
mathematical functions from ECMA-116 Full BASIC.
On any internal compiler error (ICE), the compiler will abort. This compiler
does not attempt to continue after an error is encountered.
PUBLICATIONS
The file computers-03-00069.pdf contains a PDF version of the file from
http://www.mdpi.com/2073-431X/3/3/69
which documented version 1.7 of the compiler. The paper appeared in the MDPI
Computers journal:
Ham, John G. 2014. "An ECMA-55 Minimal BASIC Compiler for x86-64 Linux®."
Computers 3, no. 3: 69-116.
MISCELLANEOUS
make -Otarget -j -l4 -fMakefile.gcc distclean all COMPILE_MODE=DEBUG2 PIE=0 LTO=0
This compiles for full debugging without the address sanitizer. Use this
if you plan to use gdb or valgrind.
make -Otarget -j -l4 -fMakefile.gcc distclean all COMPILE_MODE=DEBUG3 PIE=0 LTO=0
This compiles like DEBUG2, but switches to the intercepted mymalloc(),
myfree(), etc. in globals.c for heavy-duty debugging. Using the -v option
to ecma55 will trigger output of every single allocation and deallocation,
and also print a list of anything that was not deallocated, and it will
dump (in hexadecimal) every allocated byte. The output is of course huge and
slow, so you should redirect to a file when you run, like this:
./ecma55 -v WHATEVER.BAS >logfile 2>&1
This mode is very slow so should only be used if you are working on ecma55
itself and have memory leak problems detected by ASAN or valgrind that you
could not find by just staring at the code. In other words, this is a last
resort. Remember that this compiler does not even attempt to clean up memory
if it has to abort, which occurs when you do something silly like using
lower-case letters in a program without specifying the -X switch, or forgetting
the LET keyword on an assignment statement.
If you need more debugging information than the -v option provides, recompile
with CPPFLAGS=-DDEEP_DEBUG, but be aware that -v is then even more verbose.
Frequently Asked Questions
* My whatever.bas file won't compile. Why not?
You must use an upper-case '.BAS' suffix on the file name. You must not have
any spaces in the file name. You really should ensure the filename uses only
7-bit ASCII characters in its name.
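The three file name rules in this answer can be checked before invoking the
compiler with a small shell sketch like the one below. The function name
check_bas_name is hypothetical and is not part of this project; it covers only
the rules stated in this answer (upper-case .BAS suffix, no spaces, 7-bit
ASCII) and is not a reimplementation of whatever checks ecma55 itself makes.

```shell
# Hypothetical pre-flight check, not part of ecma55.
check_bas_name() {
    name=$1
    # Rule 1: the suffix must be upper-case .BAS.
    case $name in
        *.BAS) ;;
        *) echo "suffix must be upper-case .BAS" >&2; return 1 ;;
    esac
    # Rules 2 and 3: no spaces, and only 7-bit ASCII. In the C locale,
    # [^!-~] matches any byte outside the range 0x21..0x7E, which covers
    # the space character, control characters, and all 8-bit bytes.
    if printf '%s' "$name" | LC_ALL=C grep -q '[^!-~]'; then
        echo "name must be 7-bit ASCII without spaces" >&2
        return 1
    fi
    return 0
}

check_bas_name DEMO.BAS && echo "DEMO.BAS looks fine"
```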
* How can I renumber my program?
If a program WHATEVER.BAS compiles and runs without errors, then you can
generate a semantically equivalent renumbered version like this:
./ecma55 -R -o WHATEVER2.BAS WHATEVER.BAS
The renumbered program is in the WHATEVER2.BAS file.
* Lower-case letters in my source code don't work!
The ECMA-55 standard for Minimal BASIC does not permit lower-case letters.
Now you know why there is a caps lock key on your keyboard. You can use
lower-case letters inside of quoted strings or in REM statements if you
specify the -X option to the compiler to enable extensions, but even then
all keywords must be in upper case.
* I want to support the whatever compiler. What do I do?
Copy Makefile.gcc to Makefile.whatever, update the Makefile, and then use
-fMakefile.whatever instead of -fMakefile.gcc when you use the make program.
You should expect to work a little bit at finding the right combination of
options to the compiler, assembler, and linker for your toolchain. Also,
please be aware that you may have trouble with things like __attribute__(),
inline assembly, etc. Some assumptions of this software you should be aware
of if you attempt to use an unsupported toolchain are:
1. text files are 7-bit ASCII, use UNIX newlines (0x0A), and do not have BOM
markers
2. paths and filenames do not include any spaces, tabs, or punctuation
except for periods and underscores, and cannot begin with a period.
3. the ulimits must not be unreasonably small
4. your toolchain can process the C11 dialect of C and is for Linux
(POSIX support, etc.)
5. your toolchain provides command-line tools
6. your assembler can process GNU's version of AT&T syntax
7. your assembler includes a macro processor compatible with GNU gas's
macro processor
8. your linker supports ELF64
9. you really use GNU make with a version >= 4.x
10. you really use bash with a version >= 4.x, not csh, tcsh, ash, dash,
pdksh, etc. As of version 2.28, you now __can__ build ecma55 using dash
or ksh, but keep in mind the self-tests absolutely require a modern
version of bash.
* I want to build a static version of the compiler with musl-libc. What do I
need to do?
# make sure Makefile.gcc is using the gold linker
make -Otarget -fMakefile.gcc distclean
make -Otarget -fMakefile.gcc PIE=0 LTO=0 COMPILE_MODE=DEBUG2 CC=musl-gcc \
LDFLAGS=-static
strip ecma55
strip -R .comment ecma55
strip -R .note.gnu.gold-version ecma55
This has been tested with binutils-2.40, gcc-13.1.0, and musl-1.2.4 versions.
Note that without the LDFLAGS=-static, it doesn't work on Ubuntu 19.10...
* I want a PDF of the manual page. What do I need to do?
1. If you are using groff, try this:
groff -man -T pdf -P-pa4 ecma55.1 >ecma55.1.pdf
2. If you are using mandoc, try this:
mandoc -man -T pdf -O paper=a4 ecma55.1 >ecma55.1.pdf
Obviously if you use letter paper (U.S.A.), change a4 to letter instead. In
my opinion, the groff output looks better, but the mandoc output is
serviceable.
* String support is awful. What do I need to do?
This compiler implements the ECMA-55 Minimal BASIC standard, which does not
support strings well. This is a problem with the BASIC dialect, and not the
compiler implementation. If you need reasonable string data support for your
program, then ECMA-55 Minimal BASIC is the wrong language to use for
implementing that program.