21sh's People

Contributors

dsotomay, travmatth

21sh's Issues

2.7.6 Duplicating an Output File Descriptor

The redirection operator:
[n]>&word
shall duplicate one output file descriptor from another, or shall close one. If word evaluates to one or more digits, the file descriptor denoted by n, or standard output if n is not specified, shall be made to be a copy of the file descriptor denoted by word; if the digits in word do not represent a file descriptor already open for output, a redirection error shall result; see Consequences of Shell Errors. If word evaluates to '-', file descriptor n, or standard output if n is not specified, is closed. Attempts to close a file descriptor that is not open shall not constitute an error. If word evaluates to something else, the behavior is unspecified.
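
A minimal sketch of how the rule above might map onto system calls, assuming the word has already been expanded. The name redir_dup_output and the convention of passing n = -1 when no n was given are hypothetical, not taken from the 21sh source. The "unspecified" case (word is neither digits nor '-') is simply rejected here.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: apply "[n]>&word". Pass n = -1 when n was omitted.
 * Returns 0 on success and -1 on a redirection error. */
int redir_dup_output(int n, const char *word)
{
    int     target = (n < 0) ? STDOUT_FILENO : n;
    int     flags;
    long    src;
    char    *end;

    assert(word != NULL);
    if (strcmp(word, "-") == 0)
    {
        close(target);      /* closing an fd that is not open is not an error */
        return (0);
    }
    if (!isdigit((unsigned char)word[0]))
        return (-1);        /* behavior is unspecified; this sketch rejects it */
    errno = 0;
    src = strtol(word, &end, 10);
    if (*end != '\0' || errno == ERANGE || src > INT_MAX)
        return (-1);
    flags = fcntl((int)src, F_GETFL);
    if (flags == -1 || (flags & O_ACCMODE) == O_RDONLY)
        return (-1);        /* not open, or open but not for output */
    if ((int)src == target)
        return (0);         /* already the same descriptor */
    return (dup2((int)src, target) == -1 ? -1 : 0);
}
```

The input form [n]<&word could follow the same shape with the access-mode check reversed (reject O_WRONLY) and standard input as the default.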

fix field splitting

Fields that include quotes should keep the quoted text whole rather than being split at the quotation marks.

Refactor heredoc

  • function call to find optimal buff size
  • dfa generator for here_end detection
  • turn off icanon ? need prompt after newlines
  • turn on isig -> need to catch ctrl-c in input
  • no EOT signal -> find EOF directly

Implement Pipes

Pipe operator | should work

  • needs to exit on failed pipe?
  • how will this interact w/ redirections?

2.6.1 Tilde Expansion

  • Fifth step of post lexer processing
  • Need to implement

A "tilde-prefix" consists of an unquoted <tilde> character at the beginning of a word, followed by all of the characters preceding the first unquoted <slash> in the word, or all the characters in the word if there is no <slash>. In an assignment (see XBD Variable Assignment), multiple tilde-prefixes can be used: at the beginning of the word (that is, following the <equals-sign> of the assignment), following any unquoted <colon>, or both. A tilde-prefix in an assignment is terminated by the first unquoted <colon> or <slash>. If none of the characters in the tilde-prefix are quoted, the characters in the tilde-prefix following the <tilde> are treated as a possible login name from the user database. A portable login name cannot contain characters outside the set given in the description of the LOGNAME environment variable in XBD Other Environment Variables. If the login name is null (that is, the tilde-prefix contains only the tilde), the tilde-prefix is replaced by the value of the variable HOME. If HOME is unset, the results are unspecified. Otherwise, the tilde-prefix shall be replaced by a pathname of the initial working directory associated with the login name obtained using the getpwnam() function as defined in the System Interfaces volume of POSIX.1-2017. If the system does not recognize the login name, the results are undefined.

The pathname resulting from tilde expansion shall be treated as if quoted to prevent it being altered by field splitting and pathname expansion.

Correctly process here_end in redir_utils

If any part of word is quoted, the delimiter shall be formed by performing quote removal on word, and the here-document lines shall not be expanded. Otherwise, the delimiter shall be the word itself.

If no part of word is quoted, all lines of the here-document shall be expanded for parameter expansion, command substitution, and arithmetic expansion. In this case, the <backslash> in the input behaves as the <backslash> inside double-quotes (see Double-Quotes). However, the double-quote character ( '"' ) shall not be treated specially within a here-document, except when the double-quote appears within "$()", "``", or "${}".
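
A sketch of how the delimiter could be derived from the raw here_end word, assuming the lexer hands over the word with its quotes still in place. The name heredoc_delimiter and the expand out-parameter are hypothetical. The function performs quote removal and records whether any quoting was seen, which is what decides whether the here-document lines are expanded.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd delimiter (NULL on allocation
 * failure). *expand is set to 0 if any part of word was quoted, else 1. */
char *heredoc_delimiter(const char *word, int *expand)
{
    char    *out;
    size_t  o = 0;
    char    quote = 0;          /* 0, or the quote character we are inside */

    assert(word != NULL && expand != NULL);
    out = malloc(strlen(word) + 1);
    if (out == NULL)
        return (NULL);
    *expand = 1;
    for (size_t i = 0; word[i] != '\0'; i++)
    {
        char c = word[i];

        if (quote == 0 && (c == '\'' || c == '"'))
        {
            quote = c;          /* opening quote: dropped from the output */
            *expand = 0;
        }
        else if (quote != 0 && c == quote)
            quote = 0;          /* closing quote: dropped from the output */
        else if (c == '\\' && quote != '\'' && word[i + 1] != '\0')
        {
            *expand = 0;
            /* inside double quotes a backslash only escapes $ ` " \ */
            if (quote == '"' && strchr("$`\"\\", word[i + 1]) == NULL)
                out[o++] = c;
            else
                out[o++] = word[++i];
        }
        else
            out[o++] = c;
    }
    out[o] = '\0';
    return (out);
}
```

For example, the words 'EOF', E"O"F and \EOF would all yield the delimiter EOF with expansion disabled, while a bare EOF keeps expansion enabled.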

Accessing execve over separate tty doesn't set stat_loc correctly

https://unix.stackexchange.com/questions/385771/writing-to-stdin-of-a-process
https://serverfault.com/questions/178457/can-i-send-some-text-to-the-stdin-of-an-active-process-running-in-a-screen-sessi
https://stackoverflow.com/questions/1312922/detect-if-stdin-is-a-terminal-or-pipe
https://blog.habets.se/2009/03/Moving-a-process-to-another-terminal.html
https://home.adelphi.edu/~pe16132/csc271/ppt/summaries/ProcessesAndSignals.htm
https://unix.stackexchange.com/questions/170063/start-a-process-on-a-different-tty
https://en.wikipedia.org/wiki/Process_group
https://unix.stackexchange.com/questions/170063/start-a-process-on-a-different-tty/170075
http://www.cs.uleth.ca/~holzmann/C/system/pipeforkexec.html
https://www.gnu.org/software/libc/manual/html_node/Initializing-the-Shell.html
https://stackoverflow.com/questions/51532911/are-stdin-and-stdout-actually-the-same-file/51533009#51533009
https://stackoverflow.com/questions/36552645/how-assign-a-new-terminal-window-to-each-child-process
https://unix.stackexchange.com/questions/117981/what-are-the-responsibilities-of-each-pseudo-terminal-pty-component-software
https://unix.stackexchange.com/questions/93531/what-is-stored-in-dev-pts-files-and-can-we-open-them
https://unix.stackexchange.com/questions/260503/how-do-i-run-a-command-in-a-different-tty
https://unix.stackexchange.com/questions/148404/how-does-bash-actually-change-stdin-stdout-stderr-when-using-redirection-piping
https://unix.stackexchange.com/questions/58445/how-can-i-launch-applications-from-2-ttys-on-launch
https://unix.stackexchange.com/questions/72320/how-can-i-hook-on-to-one-terminals-output-from-another-terminal

2.6.3 Command Substitution

  • Seventh step of post lexer processing
  • Need to decide extent to which should be implemented vs stubbed

Command substitution allows the output of a command to be substituted in place of the command name itself. Command substitution shall occur when the command is enclosed as follows:

$(command)

or (backquoted version):

`command`

The shell shall expand the command substitution by executing command in a subshell environment (see Shell Execution Environment) and replacing the command substitution (the text of command plus the enclosing "$()" or backquotes) with the standard output of the command, removing sequences of one or more <newline> characters at the end of the substitution. Embedded <newline> characters before the end of the output shall not be removed; however, they may be treated as field delimiters and eliminated during field splitting, depending on the value of IFS and quoting that is in effect. If the output contains any null bytes, the behavior is unspecified.

Within the backquoted style of command substitution, <backslash> shall retain its literal meaning, except when followed by: '$', '`', or <backslash>. The search for the matching backquote shall be satisfied by the first unquoted non-escaped backquote; during this search, if a non-escaped backquote is encountered within a shell comment, a here-document, an embedded command substitution of the $(command) form, or a quoted string, undefined results occur. A single-quoted or double-quoted string that begins, but does not end, within the "`...`" sequence produces undefined results.

With the $(command) form, all characters following the open parenthesis to the matching closing parenthesis constitute the command. Any valid shell script can be used for command, except a script consisting solely of redirections which produces unspecified results.

The results of command substitution shall not be processed for further tilde expansion, parameter expansion, command substitution, or arithmetic expansion. If a command substitution occurs inside double-quotes, field splitting and pathname expansion shall not be performed on the results of the substitution.

Command substitution can be nested. To specify nesting within the backquoted version, the application shall precede the inner backquotes with <backslash> characters; for example:

\`command\`

The syntax of the shell command language has an ambiguity for expansions beginning with "$((", which can introduce an arithmetic expansion or a command substitution that starts with a subshell. Arithmetic expansion has precedence; that is, the shell shall first determine whether it can parse the expansion as an arithmetic expansion and shall only parse the expansion as a command substitution if it determines that it cannot parse the expansion as an arithmetic expansion. The shell need not evaluate nested expansions when performing this determination. If it encounters the end of input without already having determined that it cannot parse the expansion as an arithmetic expansion, the shell shall treat the expansion as an incomplete arithmetic expansion and report a syntax error. A conforming application shall ensure that it separates the "$(" and '(' into two tokens (that is, separate them with white space) in a command substitution that starts with a subshell. For example, a command substitution containing a single subshell could be written as:

$( (command) )
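
To illustrate only the output-capture and trailing-<newline> rule, here is a sketch that leans on popen(), which hands the command to the system shell. That is a stand-in for illustration: a real implementation in this project would presumably fork its own subshell and read from a pipe. The name command_subst is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: run cmd and return its standard output as a malloc'd
 * string with all trailing newlines removed. Returns NULL on failure. */
char *command_subst(const char *cmd)
{
    FILE    *fp;
    char    *out = NULL;
    size_t  len = 0;
    size_t  cap = 0;
    int     c;

    assert(cmd != NULL);
    fp = popen(cmd, "r");
    if (fp == NULL)
        return (NULL);
    while ((c = fgetc(fp)) != EOF)
    {
        if (len + 1 >= cap)             /* keep room for the terminator */
        {
            size_t  ncap = cap ? cap * 2 : 64;
            char    *tmp = realloc(out, ncap);

            if (tmp == NULL)
            {
                free(out);
                pclose(fp);
                return (NULL);
            }
            out = tmp;
            cap = ncap;
        }
        out[len++] = (char)c;
    }
    pclose(fp);
    if (out == NULL && (out = malloc(1)) == NULL)
        return (NULL);                  /* command printed nothing */
    while (len > 0 && out[len - 1] == '\n')
        len--;                          /* strip trailing newlines only */
    out[len] = '\0';
    return (out);
}
```

Embedded newlines are kept on purpose; whether they later split fields depends on IFS and quoting, as the text above describes.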

Duplicate input fd

The redirection operator:
[n]<&word
shall duplicate one input file descriptor from another, or shall close one. If word evaluates to one or more digits, the file descriptor denoted by n, or standard input if n is not specified, shall be made to be a copy of the file descriptor denoted by word; if the digits in word do not represent a file descriptor already open for input, a redirection error shall result; see Consequences of Shell Errors. If word evaluates to '-', file descriptor n, or standard input if n is not specified, shall be closed. Attempts to close a file descriptor that is not open shall not constitute an error. If word evaluates to something else, the behavior is unspecified.

Manage missing quotes

Commands entered that are missing quote characters should prompt user to enter more input
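
A small scanner along these lines could decide whether another prompt is needed. This is a sketch that only tracks single quotes, double quotes and backslashes; the name open_quote is hypothetical, and backquotes or "$(" nesting would need more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 0 if the line is complete, the quote
 * character (' or ") that is still open, or '\\' if the line ends in an
 * unquoted backslash (line continuation). */
char open_quote(const char *line)
{
    char quote = 0;

    assert(line != NULL);
    for (size_t i = 0; line[i] != '\0'; i++)
    {
        char c = line[i];

        if (c == '\\' && quote != '\'')
        {
            if (line[i + 1] == '\0')
                return (quote ? quote : '\\');
            i++;                /* skip the escaped character */
        }
        else if (quote == 0 && (c == '\'' || c == '"'))
            quote = c;          /* opening quote */
        else if (c == quote)
            quote = 0;          /* matching closing quote */
    }
    return (quote);
}
```

A read loop could keep appending input lines while open_quote() returns non-zero.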

2.4: Implement Reserved Words (stub)

  • After alias substitution, second step of post lexing processing
  • Need to decide on extent of implementation vs stubbing, detection of these words may expand number of tokens to be appended to token stream.
! { } case do done elif else esac fi for if in then until while

This recognition shall only occur when none of the characters is quoted and when the
word is used as:
- The first word of a command
- The first word following one of the reserved words other than case, for, or in
- The third word in a case command (only in is valid in this case)
- The third word in a for command (only in and do are valid in this case)
See the grammar in Shell Grammar.
The following words may be recognized as reserved words on some implementations (when
none of the characters are quoted), causing unspecified results:
[[  ]] function select
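
As a stub, recognition could start as a plain table lookup. The sketch below (the name is_reserved_word is hypothetical) only answers whether a word is in the required list; checking that the word is unquoted and in one of the positions listed above is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if word is one of the required reserved
 * words, 0 otherwise. Context and quoting are not checked here. */
int is_reserved_word(const char *word)
{
    static const char *const reserved[] = {
        "!", "{", "}", "case", "do", "done", "elif", "else", "esac",
        "fi", "for", "if", "in", "then", "until", "while"
    };

    assert(word != NULL);
    for (size_t i = 0; i < sizeof(reserved) / sizeof(reserved[0]); i++)
    {
        if (strcmp(word, reserved[i]) == 0)
            return (1);
    }
    return (0);
}
```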

Ability to use outside ttyid as native

The shell should take the current tty as the basis for deriving stdin/stdout/stderr, since each of them aliases the same fd (for a given tty). It should also be possible to pass a ttyid to use as the basis for shell functionality.

2.6.2 Parameter Expansion

  • Sixth step of post lexer processing
  • Four types of expansions available, all will behave differently depending on the state of word's initialization
  • Refer to source section 2.6.2 for specifics of each expansion

2.6.4 Arithmetic Expansion

  • Eighth step of post lexer processing
  • Need to decide to what extent to implement vs stub

Arithmetic expansion provides a mechanism for evaluating an arithmetic expression and substituting its value. The format for arithmetic expansion shall be as follows:
$((expression))
The expression shall be treated as if it were in double-quotes, except that a double-quote inside the expression is not treated specially. The shell shall expand all tokens in the expression for parameter expansion, command substitution, and quote removal.
Next, the shell shall treat this as an arithmetic expression and substitute the value of the expression. The arithmetic expression shall be processed according to the rules given in Arithmetic Precision and Operations, with the following exceptions:
Only signed long integer arithmetic is required.
Only the decimal-constant, octal-constant, and hexadecimal-constant constants specified in the ISO C standard, Section 6.4.4.1 are required to be recognized as constants.
The sizeof() operator and the prefix and postfix "++" and "--" operators are not required.
Selection, iteration, and jump statements are not supported.
All changes to variables in an arithmetic expression shall be in effect after the arithmetic expansion, as in the parameter expansion "${x=value}".
If the shell variable x contains a value that forms a valid integer constant, optionally including a leading <plus-sign> or <hyphen-minus>, then the arithmetic expansions "$((x))" and "$(($x))" shall return the same value.
As an extension, the shell may recognize arithmetic expressions beyond those listed. The shell may use a signed integer type with a rank larger than the rank of signed long. The shell may use a real-floating type instead of signed long as long as it does not affect the results in cases where there is no overflow. If the expression is invalid, or the contents of a shell variable used in the expression are not recognized by the shell, the expansion fails and the shell shall write a diagnostic message to standard error indicating the failure.
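
If only a stub is implemented at first, recognizing the three required constant forms is a natural starting point. The sketch below relies on strtol() with base 0, which accepts decimal, octal (leading 0) and hexadecimal (leading 0x) input with an optional sign. The name parse_int_constant is hypothetical. Note that strtol() also skips leading white space, which a stricter parser might reject.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse text as a signed long constant. Returns 0 and
 * stores the result in *value on success, -1 if text is not a valid
 * constant or does not fit in a long. */
int parse_int_constant(const char *text, long *value)
{
    char *end;

    assert(text != NULL && value != NULL);
    errno = 0;
    *value = strtol(text, &end, 0);     /* base 0: detect 0 and 0x prefixes */
    if (end == text || *end != '\0' || errno == ERANGE)
        return (-1);
    return (0);
}
```

The same routine could serve both "$((x))" and "$(($x))": in the first case it is applied to the value looked up for x, in the second to the text left after parameter expansion.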

2.7.7 Open File Descriptors for Reading and Writing

The redirection operator:
[n]<>word
shall cause the file whose name is the expansion of word to be opened for both reading and writing on the file descriptor denoted by n, or standard input if n is not specified. If the file does not exist, it shall be created.

2.5.1 Positional Parameters (stub?)

  • Third step of post lexer processing
  • Need to understand how this relates to 21/42sh, and how extensively it needs to be implemented
A positional parameter is a parameter denoted by the decimal value represented by one or
more digits, other than the single digit 0. The digits denoting the positional parameters shall
always be interpreted as a decimal value, even if there is a leading zero. When a positional
parameter with more than one digit is specified, the application shall enclose the digits in
braces (see Parameter Expansion). Positional parameters are initially assigned when the shell is invoked (see sh), temporarily replaced when a shell function is invoked (see Function Definition Command), and can be reassigned with the set special built-in command.

Implement prefix processing

Prefix tokens should be recognized and parsed correctly; need to ensure assignment words are correctly lexed and managed

Manage Bang sequences

Need to understand how bang operators ! are used within the shell and if there are multiple types. Need to decide whether any such operators need to be implemented, what issues to open, and where they will be performing their operations

Properly lex & parse subshells

When lexing, opening ( and closing ) need to be identified (rule 4 should be refactored to include them). Need to ensure ( and ) are properly parsed in syntactic analysis so that the correct AST is generated.

Refactor redirections

Need to examine the redirection logic to see whether any functionality can be removed and whether duplicated logic can be replaced by a common function call

Remove tab parse in lexer

source: src/lexer/lexer.c/lex_switch
Tabs cannot be passed as part of input in terminals, only in test strings. Will be removed as part of the IFS work

Simplify heredoc

Cannot truncate a pipe (to support backspace), so a variable-sized buffer is currently used to hold input. Instead, should use standard write(with vmin = vtime = 0) and allow the terminal line discipline to manage backspace. Read in a buffer at a time and map the buffer characters through the here_end dfa. If not in a state, write to the output buffer, and write the output buffer to the file when full. If a state is detected, write the existing buffer and read in until either here_end is detected (then break) or the state exits (then write the buffer and continue as normal). Should use two global variables to manage the sending of signals; only need a given read line to exactly match here_end

refactor quote management

  • single quotes ' ignore all special characters within
  • double quotes " recognize $(, ${, $(( and ` within
  • backticks ` ignore all special characters within
  • param expansion ${ recognize ', ", $(, ${, $(( and ` within
  • command substitution $( recognize ', ", $(, ${, $(( and ` within
  • arithmetic expansion $(( recognizes ${, $(, `
    idea:
    single format for managing characters:
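int manage_single_quote(char **str, int start, int *end,
                        int (*f)(char **str, int start, int end))
{
     int i = start + 1; // assumes (*str)[start] is the opening quote

     while ((*str)[i] && (*str)[i] != '\'') // iter string
          i++;
     if (!(*str)[i]) // if end not found, return -1 (assumed stand-in for nil)
          return (-1);
     *end = i;
     return (f(str, start, i)); // else, return (*f)
}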

Verify redirect output

Need to investigate noclobber and whether that should be implemented, and verify redirect works as intended

The two general formats for redirecting output are:
[n]>word
[n]>|word
where the optional n represents the file descriptor number. If the number is omitted, the redirection shall refer to standard output (file descriptor 1).

Output redirection using the '>' format shall fail if the noclobber option is set (see the description of set -C) and the file named by the expansion of word exists and is a regular file. Otherwise, redirection using the '>' or ">|" formats shall cause the file whose name results from the expansion of word to be created and opened for output on the designated file descriptor, or standard output if none is specified. If the file does not exist, it shall be created; otherwise, it shall be truncated to be an empty file after being opened.
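
If noclobber is implemented, the check could be folded into the open() call itself. The sketch below is one way to do that; the name open_output and its two flag parameters are hypothetical. With noclobber set, O_EXCL makes "create only if absent" atomic, and an existing path is then refused only when it is a regular file (or cannot be examined).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: open path for "[n]>word" (force = 0) or "[n]>|word"
 * (force = 1). noclobber mirrors `set -C`. Returns an fd, or -1 on error. */
int open_output(const char *path, int noclobber, int force)
{
    struct stat st;
    int         fd;

    assert(path != NULL);
    if (noclobber && !force)
    {
        fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
        if (fd != -1 || errno != EEXIST)
            return (fd);            /* created it, or failed for another reason */
        if (stat(path, &st) == -1 || S_ISREG(st.st_mode))
        {
            errno = EEXIST;         /* existing regular file is protected */
            return (-1);
        }
        return (open(path, O_WRONLY)); /* e.g. a device or FIFO: allowed */
    }
    /* default behavior: create if missing, otherwise truncate */
    return (open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666));
}
```

The returned descriptor would then be moved onto n (or standard output) with dup2(). The append form would presumably differ only in using O_APPEND in place of O_TRUNC.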

Implement append redir output

Appended output redirection shall cause the file whose name results from the expansion of word to be opened for output on the designated file descriptor. The file is opened as if the open() function as defined in the System Interfaces volume of POSIX.1-2017 was called with the O_APPEND flag. If the file does not exist, it shall be created.
The general format for appending redirected output is as follows:
[n]>>word
where the optional n represents the file descriptor number. If the number is omitted, the redirection refers to standard output (file descriptor 1).

2.5.2 Special Parameters (stub?)

  • Fourth step of post lexer processing
  • Need to understand the extent to which this needs to be implemented vs stubbed out

Listed below are the special parameters and the values to which they shall expand. Only the values of the special parameters are listed; see wordexp for a detailed summary of all the stages involved in expanding words.

@
Expands to the positional parameters, starting from one, initially producing one field for each positional parameter that is set. When the expansion occurs in a context where field splitting will be performed, any empty fields may be discarded and each of the non-empty fields shall be further split as described in Field Splitting. When the expansion occurs within double-quotes, the behavior is unspecified unless one of the following is true:
Field splitting as described in Field Splitting would be performed if the expansion were not within double-quotes (regardless of whether field splitting would have any effect; for example, if IFS is null).

The double-quotes are within the word of a ${parameter:-word} or a ${parameter:+word} expansion (with or without the <colon>; see Parameter Expansion) which would have been subject to field splitting if parameter had been expanded instead of word.

If one of these conditions is true, the initial fields shall be retained as separate fields, except that if the parameter being expanded was embedded within a word, the first field shall be joined with the beginning part of the original word and the last field shall be joined with the end part of the original word. In all other contexts the results of the expansion are unspecified. If there are no positional parameters, the expansion of '@' shall generate zero fields, even when '@' is within double-quotes; however, if the expansion is embedded within a word which contains one or more other parts that expand to a quoted null string, these null string(s) shall still produce an empty field, except that if the other parts are all within the same double-quotes as the '@', it is unspecified whether the result is zero fields or one empty field.

*
Expands to the positional parameters, starting from one, initially producing one field for each positional parameter that is set. When the expansion occurs in a context where field splitting will be performed, any empty fields may be discarded and each of the non-empty fields shall be further split as described in Field Splitting. When the expansion occurs in a context where field splitting will not be performed, the initial fields shall be joined to form a single field with the value of each parameter separated by the first character of the IFS variable if IFS contains at least one character, or separated by a <space> if IFS is unset, or with no separation if IFS is set to a null string.

#
Expands to the decimal number of positional parameters. The command name (parameter 0) shall not be counted in the number given by '#' because it is a special parameter, not a positional parameter.

?
Expands to the decimal exit status of the most recent pipeline (see Pipelines).

-
(Hyphen.) Expands to the current option flags (the single-letter option names concatenated into a string) as specified on invocation, by the set special built-in command, or implicitly by the shell.
$
Expands to the decimal process ID of the invoked shell. In a subshell (see Shell Execution Environment ), '$' shall expand to the same value as that of the current shell.

!
Expands to the decimal process ID of the most recent background command (see Lists) executed from the current shell. (For example, background commands executed from subshells do not affect the value of "$!" in the current shell environment.) For a pipeline, the process ID is that of the last command in the pipeline.

0
(Zero.) Expands to the name of the shell or shell script. See sh for a detailed description of how this name is derived.
See the description of the IFS variable in Shell Variables.
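
The joining rule for '*' in a context without field splitting (first character of IFS, a <space> when IFS is unset, nothing when IFS is null) is small enough to sketch on its own. The name join_star is hypothetical, and passing ifs = NULL is this sketch's way of saying that IFS is unset.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join params[0..count-1] into one malloc'd field.
 * ifs == NULL means IFS is unset; ifs == "" means IFS is null. */
char *join_star(char *const params[], size_t count, const char *ifs)
{
    char    sep = (ifs == NULL) ? ' ' : ifs[0]; /* '\0' when IFS is null */
    size_t  len = 1;
    char    *out;
    char    *p;

    assert(params != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        len += strlen(params[i]) + 1;           /* value plus one separator */
    out = malloc(len);
    if (out == NULL)
        return (NULL);
    p = out;
    for (size_t i = 0; i < count; i++)
    {
        size_t n = strlen(params[i]);

        if (i > 0 && sep != '\0')
            *p++ = sep;
        memcpy(p, params[i], n);
        p += n;
    }
    *p = '\0';
    return (out);
}
```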

refactor: separate traversal and execution

Traversal is actually semantic analysis: it should traverse the AST and construct an execution tree of nodes, where each node is either an operator (whose left and right children are operators, pipes, or commands), a pipe (whose children are pipes or commands), or a command (which contains a command array and a redirection chain). Semantic analysis should validate that commands exist, differentiate between builtins and programs, find full program paths, and validate filenames when needed.
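
One possible layout for such an execution tree is sketched below. Every type and field name here is hypothetical and not taken from the 21sh source; count_commands is included only to show a traversal over the three node kinds.

```c
#include <assert.h>
#include <stddef.h>

enum node_type { NODE_OPERATOR, NODE_PIPE, NODE_COMMAND };

/* One entry of a command's redirection chain (hypothetical layout). */
struct redir {
    int             fd;         /* the n in [n]>word */
    int             kind;       /* which redirection operator */
    const char      *word;      /* target after expansion */
    struct redir    *next;
};

struct exec_node {
    enum node_type      type;
    int                 op;         /* operators only, e.g. ';' or '&' */
    struct exec_node    *left;      /* operators and pipes */
    struct exec_node    *right;
    char                **argv;     /* commands only: command array */
    struct redir        *redirs;    /* commands only: redirection chain */
    int                 is_builtin; /* filled in by semantic analysis */
    const char          *path;      /* resolved program path, if any */
};

/* Count the command leaves reachable from node. */
size_t count_commands(const struct exec_node *node)
{
    if (node == NULL)
        return (0);
    if (node->type == NODE_COMMAND)
        return (1);
    assert(node->type == NODE_OPERATOR || node->type == NODE_PIPE);
    return (count_commands(node->left) + count_commands(node->right));
}
```

Keeping is_builtin and path on the command node lets the semantic pass do all lookups once, so the execution pass only has to walk the tree.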

Fix Tilde expansion

As it stands, tilde expansion replaces ~ with value of $HOME. Tilde expansion instead should replace tilde-prefix (string from starting ~ up to first /, or entire string if no / present) with home directory associated with login given by prefix word, or $HOME if none present
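
The intended behavior could be sketched as follows, using getpwnam() for a named login and $HOME for a bare tilde. The name expand_tilde is hypothetical. The sketch assumes the caller has already checked that the tilde-prefix is unquoted, and it leaves the word unchanged when HOME is unset or the login is unknown, which is just one choice for cases the spec leaves unspecified or undefined.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: returns a malloc'd copy of word with its
 * tilde-prefix expanded, or an unchanged copy if nothing applies.
 * Returns NULL on allocation failure. */
char *expand_tilde(const char *word)
{
    const char  *slash;
    const char  *home = NULL;
    size_t      plen;           /* length of the tilde-prefix, ~ included */
    char        *out;

    assert(word != NULL);
    if (word[0] != '~')
        return (strdup(word));
    slash = strchr(word, '/');
    plen = slash ? (size_t)(slash - word) : strlen(word);
    if (plen == 1)
        home = getenv("HOME");  /* bare ~ */
    else
    {
        char            *login = strndup(word + 1, plen - 1);
        struct passwd   *pw;

        if (login == NULL)
            return (NULL);
        pw = getpwnam(login);   /* look the login up in the user database */
        free(login);
        if (pw != NULL)
            home = pw->pw_dir;
    }
    if (home == NULL)
        return (strdup(word));  /* HOME unset or unknown login: left as-is */
    out = malloc(strlen(home) + strlen(word + plen) + 1);
    if (out == NULL)
        return (NULL);
    strcpy(out, home);
    strcat(out, word + plen);   /* append the rest, starting at the slash */
    return (out);
}
```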

2.3.1: Implement Alias Substitution (create stub)

  • Optional in 21sh
  • After lexing a token and before parsing the ast, alias is the first step of post lex processing
After a token has been delimited, but before applying the grammatical rules in Shell
Grammar, a resulting word that is identified to be the command name word of a simple
command shall be examined to determine whether it is an unquoted, valid alias name.
However, reserved words in correct grammatical context shall not be candidates for
alias substitution. A valid alias name (see XBD Alias Name) shall be one that has
been defined by the alias utility and not subsequently undefined using unalias.
Implementations also may provide predefined valid aliases that are in effect when the
shell is invoked. To prevent infinite loops in recursive aliasing, if the shell is
not currently processing an alias of the same name, the word shall be replaced by
the value of the alias; otherwise, it shall not be replaced.

If the value of the alias replacing the word ends in a <blank>, the shell shall check
the next command word for alias substitution; this process shall continue until a
word is found that is not a valid alias or an alias value does not end in a <blank>.

When used as specified by this volume of POSIX.1-2017, alias definitions shall not be
inherited by separate invocations of the shell or by the utility execution
environments invoked by the shell; see Shell Execution Environment.

source: spec
