From: Xah
Subject: text editor feature: extend selection by semantic unit
Date: 
Message-ID: <bb6d1643-cf17-46dd-a15d-6cf1ffed5968@a19g2000pra.googlegroups.com>
Here's an article about a text editor feature I recently extended.

I hope it's a useful idea to someone.

A Text Editor Feature: Extend Selection By Semantic Unit
http://xahlee.org/emacs/syntax_tree_walk.html

A plain text version follows (but you need the CSS highlighting on the
web page to see the point).

---------------------------------------
A Text Editor Feature: Extend Selection By Semantic Unit

Xah Lee, 2006-09, 2008-10

In this article, I introduce a feature of the Mathematica front end
(that is, the editor bundled with Mathematica) that could be useful
in any editor and for any language.

In the Mathematica front end, a user can press a key (“Ctrl+.”) and the
token the cursor is on will be selected (highlighted). When the user
presses the key again, the selection expands to highlight the next
smallest semantic unit that contains it. When the key is pressed again,
it extends further.

Here is an example of Mathematica code with highlights showing its
extend-selection behavior. The selection starts at the “n” inside the
braces and extends outward to cover successively higher-level
syntactic units.

Table[n/(n + 1), {n, 1, 10, 1/2}]

Here are different scenarios, with the cursor starting at different
positions:

Table[n/(n + 1), {n, 1, 10, 1/2}]

Table[n/(n + 1), {n, 1, 10, 1/2}]

Table[n/(n + 1), {n, 1, 10, 1/2}]

Examples For C-Like Syntax

Here are some examples in a language with C-like syntax (C, C++, C#,
Java, JavaScript, and others).

class PrintMe {public main(String[] args) {print("Nice!");}}

class PrintMe {public main(String[] args) {print("Nice!");}}

class PrintMe {public main(String[] args) {print("Nice!");}}

/* print out a string */

/* print out a string */

Nested Syntax Examples

For a language with nested syntax, suppose we have this XML example:

<entry>
  <title>Gulliver's Travels</title>
  <id>tag:xahlee.org,2006-08-21:030437</id>
  <updated>2006-08-20T20:04:41-07:00</updated>
  <summary>Annotated a chapter of Gulliver's Travels</summary>
  <link rel="alternate" href="../p/Gullivers_Travels/gt3ch05.html"/>
</entry>

If the cursor is inside a tag's enclosed content, say, on the letter
T in the string “Gulliver's Travels” inside the <title> tag, then the
repeated extension is obvious. However, suppose the cursor is at the t
in “alternate” inside the “link” tag. Then it would first select
the whole word “alternate”, then expand to include the double quotes
(“"alternate"”), then the whole attribute “rel="alternate"”, then the
whole link tag, then the whole content of the entry tag, and finally
the “<entry>” tags themselves.

Lisp Example

For Lisp, the syntax is almost purely nested parentheses (the
exceptions are characters such as “;',@|#” that have special syntactic
meanings). Here are some examples of how this feature would work in
Lisp.

(defun insertMe () (interactive) (insert "«»") (backward-char 1))

(defun insertMe () (interactive) (insert "«»") (backward-char 1))

(defun insertMe () (interactive) (insert "«»") (backward-char 1))

(defun insertMe () (interactive) (insert "«»") (backward-char 1))

Note: Emacs's Lisp mode provides several functions to traverse nested
syntax: backward-sexp, forward-sexp, backward-up-list, down-list,
backward-list, forward-list, mark-sexp. As a result, it is relatively
trivial to implement the extend-selection-by-semantic-unit feature
shown above. You just need to call one of the sexp-walking functions
to move the cursor to the right place, then call mark-sexp.
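
As a minimal sketch of that idea (the command name
my-select-enclosing-list is hypothetical, and this ignores strings and
comments), moving up one level with backward-up-list and then calling
mark-sexp is enough to select the innermost enclosing list:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-select-enclosing-list ()
  "Select the innermost list that contains point."
  (interactive)
  (backward-up-list 1)   ; point is now just before the opening paren
  (mark-sexp 1))         ; mark is set just after the matching close paren
```

Because point ends up on the opening parenthesis, calling the command
again moves up to the next outer list and selects that one.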

Summary

In summary, this highlighting feature is a parse tree walker. Each
invocation goes up one level in the parse tree and selects that node
together with all its branches.
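
To make the "levels" concrete for the Lisp case, here is a small
sketch (the function name my-enclosing-list-bounds is hypothetical)
that lists every enclosing list around point, innermost first. It
relies on syntax-ppss, whose 9th element holds the positions of the
currently open parentheses. Each press of the key corresponds to
moving one entry further along this list.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-enclosing-list-bounds ()
  "Return (START . END) pairs for every list enclosing point.
The innermost list comes first, the outermost last."
  (let (bounds)
    (save-excursion
      ;; (nth 9 (syntax-ppss)) is the list of open-paren positions
      ;; around point, outermost first.
      (dolist (open (nth 9 (syntax-ppss)))
        (goto-char open)
        (forward-sexp 1)                  ; jump past the matching close paren
        (push (cons open (point)) bounds)))
    bounds))
```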

Emacs Implementation

For the lisp case, it's easy to implement. Here's the code by Nikolaj
Schumacher.

(defun extend-selection (arg &optional incremental)
  "Mark the sexp surrounding point.
Subsequent calls mark higher levels of sexps."
  (interactive (list (prefix-numeric-value current-prefix-arg)
                     (or (and transient-mark-mode mark-active)
                         (eq last-command this-command))))
  (if incremental
      ;; Repeated call: move up ARG levels, then mark the enclosing sexp.
      (progn
        (up-list (- arg))
        (forward-sexp)
        (mark-sexp -1))
    (if (> arg 1)
        ;; A prefix argument N > 1 starts N levels up.
        (extend-selection (1- arg) t)
      ;; First call: mark the symbol at point.
      (re-search-forward "\\_>")
      (mark-sexp -1))))

For languages with C-like syntax, a practical solution that works in
most cases should be easy. Basically, the command will do:

    * 1st press → select current word.
    * 2nd press → current string, if inside double quotes.
    * press again → current line.
    * press again → current content inside {}.
    * press again → next outer braces.
    * press again → whole function definition.

Emacs Lisp has many functions that already understand each of these
syntactic units, so it's not difficult to put the whole thing together.
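
One possible way to put the steps above together is sketched below.
All three function names are hypothetical, and this is a rough draft
rather than a finished command. It collects candidate units around
the selection (word, string, line, enclosing brackets) and picks the
smallest one that is strictly larger than the current selection. That
gives roughly the order in the list above, except that parentheses
and square brackets are treated like braces, and the line may come
before or after a bracket pair depending on which is smaller.

```elisp
(require 'thingatpt)

;; Hypothetical helper: candidate units around POS, in no particular order.
(defun my-c-like-candidates (pos)
  "Return a list of (START . END) units around POS."
  (save-excursion
    (goto-char pos)
    (let* ((ppss (syntax-ppss))
           (word (bounds-of-thing-at-point 'word))
           (units (list (cons (line-beginning-position)
                              (line-end-position)))))
      (when word (push word units))
      ;; Inside a string, (nth 8 ppss) is the position of its opening quote.
      (when (nth 3 ppss)
        (goto-char (nth 8 ppss))
        (when (ignore-errors (forward-sexp 1) t)
          (push (cons (nth 8 ppss) (point)) units)))
      ;; (nth 9 ppss) lists the open brackets around POS, outermost first.
      (dolist (open (nth 9 ppss))
        (goto-char open)
        (when (ignore-errors (forward-sexp 1) t)
          (push (cons open (point)) units)))
      units)))

;; Hypothetical helper: choose the next unit to select.
(defun my-next-unit (beg end)
  "Return the smallest unit that contains BEG..END and is larger, or nil."
  (let ((size (- end beg))
        best)
    (dolist (u (my-c-like-candidates beg))
      (let ((len (- (cdr u) (car u))))
        (when (and (<= (car u) beg) (>= (cdr u) end) (> len size)
                   (or (null best) (< len (- (cdr best) (car best)))))
          (setq best u))))
    best))

;; Hypothetical command: bind it to a key and press repeatedly.
(defun my-extend-selection-c-like ()
  "Grow the selection to the next larger unit around it."
  (interactive)
  (let* ((active (use-region-p))
         (unit (my-next-unit (if active (region-beginning) (point))
                             (if active (region-end) (point)))))
    (when unit
      (goto-char (car unit))
      (push-mark (cdr unit) t t))))
```

The selection logic lives in my-next-unit, which takes plain buffer
positions, so it can be tried out from M-: without an active region.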

For the XML case, with its regular nested syntax of start/end tags,
where a start tag may itself contain a sequence of tokens, one may need
a bit more work than in the Lisp case, but there's already a full XML
parser in nxml-mode.
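
As a rough sketch of the element-level step only (the command name
my-select-enclosing-element is hypothetical, and it does not handle
the finer steps inside a start tag such as attribute values), the
nxml-mode motion commands nxml-backward-up-element and
nxml-forward-element can be combined like this:

```elisp
(require 'nxml-mode)

;; Hypothetical helper, for illustration only; meant for nxml-mode buffers.
(defun my-select-enclosing-element ()
  "Select the XML element that contains point."
  (interactive)
  (nxml-backward-up-element 1)   ; move to the start tag of the enclosing element
  (push-mark (point) t t)        ; put the mark at the element start
  (nxml-forward-element 1))      ; move point past the matching end tag
```

With the cursor in the text of the <title> element from the example
above, this selects the whole <title>...</title> element.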

  Xah
∑ http://xahlee.org/

☄
From: Xah
Subject: Re: text editor feature: extend selection by semantic unit
Date: 
Message-ID: <846c0ccf-a18d-4340-a930-a95c72e72118@w39g2000prb.googlegroups.com>
Xah wrote:
> here's a article about a text editor feature i recently extended.

> I hope its a useful idea to someone.

> A Text Editor Feature: Extend Selection By Semantic Unit
> http://xahlee.org/emacs/syntax_tree_walk.html

> plain text version follows (but you need the CSS highlight to see the
> point)

Wei Weng <·····@acedsl.com> wrote:
> hi xah,
>
> I tested this with my .emacs file on the line:
>
> (global-set-key (kbd "C-c n") 'extend-selection)
>
> 1st press -> select current word PASS
> 2nd press -> current string if inside double quotes.
> press again -> current line. PASS
>
> Then if I press C-c n again, it starts giving me error:
>
> up-list: Scan error: "Unbalanced parentheses", 17996, 1 [3 times]
>
> My semantic version
>
> ECB 2.32 uses loaded semantic 2.0pre4, eieio 1.0 and speedbar 1.0.1.

Hi Wei,

Nikolaj's code only works in lisp mode. In particular, it does not
work in modes for languages with C-like syntax. However, it works quite
well in any mode if you just use it to select the current word.

The semantic package used by ECB is not required for his code to work.

Also, he's got a better version that will extend the selection to the
current string if there is one.

Here's the updated code:

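;; by Nikolaj Schumacher, 2008-10-20. Licensed under GPL.

(defun semnav-up (arg)
  "Move up ARG list levels, first stepping out of a string if point is in one."
  (interactive "p")
  ;; (nth 3 (syntax-ppss)) is non-nil when point is inside a string.
  (when (nth 3 (syntax-ppss))
    (if (> arg 0)
        (progn
          ;; Leave the string forward, past its closing quote.
          (skip-syntax-forward "^\"")
          (goto-char (1+ (point)))
          (setq arg (1- arg)))
      ;; Leave the string backward, to before its opening quote.
      (skip-syntax-backward "^\"")
      (goto-char (1- (point)))
      (setq arg (1+ arg))))
  (up-list arg))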

(defun extend-selection (arg &optional incremental)
  "Mark the symbol surrounding point.
Subsequent calls mark higher levels of sexps."
  (interactive (list (prefix-numeric-value current-prefix-arg)
                     (or (and transient-mark-mode mark-active)
                         (eq last-command this-command))))
  (if incremental
      (progn
        (semnav-up (- arg))
        (forward-sexp)
        (mark-sexp -1))
    (if (> arg 1)
        (extend-selection (1- arg) t)
      (if (looking-at "\\=\\(\\s_\\|\\sw\\)*\\_>")
          (goto-char (match-end 0))
        (unless (memq (char-before) '(?\) ?\"))
          (forward-sexp)))
      (mark-sexp -1))))

This new code is updated on my page here:
http://xahlee.org/emacs/syntax_tree_walk.html

It is now also part of my ergonomic keybinding set, bound to Alt+8
(see http://xahlee.org/emacs/ergonomic_emacs_keybinding.html ).

He intends to work more on this and perhaps turn it into a package.
I will also extend it to cover language modes with C-like syntax,
perhaps with a draft version by the end of this year.

He's also soliciting ideas and feedback. Thanks.

  Xah
∑ http://xahlee.org/

☄