You might sometimes hear about "programming" a Web page, but formatting text in a browser window is not actually programming. As their names suggest, the formatting languages are markup languages. In other words, they consist of special character sequences inserted into the body of a document to indicate how the file should look when printed or displayed, or to define its logical structure (for example, paragraphs and bulleted lists). Without a markup language, the displayed data is raw text, with no character or paragraph formatting.
Markup languages define the appearance of a document using codes called descriptors or tags, which take the form <tag>...</tag>. The first descriptor marks the point where formatting starts, and the second (with a forward slash) marks where it ends. If you omit the second descriptor, the formatting specified by the first descriptor is applied to the end of the document.
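
The way a program sees paired descriptors can be sketched with Python's standard html.parser module. This is only an illustration: the class name TagLogger, the helper tag_events, and the sample strings are assumptions made for this example, not part of any browser.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record every opening tag, closing tag, and run of text in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))


def tag_events(markup):
    """Parse a markup string and return the recorded events."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# A paired descriptor: formatting starts at <b> and stops at </b>.
events = tag_events("Plain text, <b>bold text</b>, plain again.")

# The closing descriptor is omitted: no "end" event is ever reported,
# so nothing tells the reader of the markup where the formatting stops.
unclosed_events = tag_events("Plain text, <b>bold to the end")

for kind, value in events:
    print(kind, repr(value))
```

The parser reports the opening and closing descriptors as separate events, which is why a missing closing descriptor leaves the formatting in force for the rest of the text.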

Note:
The markup language can be applied to unstructured text either manually, with a text editor (such as Notepad), or with a graphical tool that adds code when you visually arrange the text to your liking.
Beginners usually find graphical tools easier to work with, although such tools are not as precise as text editors.

Hypertext Markup Language (HTML)

HTML (HyperText Markup Language) is the foundation of most Web pages. HTML allows you to publish text, graphics, and spreadsheet content, and even to create reports from databases for interactive reading. It works well for organizing and formatting static information of any type, as it allows you to:

● set the size and font of the text;
● format the text in bold, italic or underline;
● set links to other pages;
● insert images;
● create page titles;
● create tables;
● embed the metadata necessary for the search engines to work.

Note:
Metadata is hidden data that does not appear on a Web page but can be read by a search engine, which helps users find the site.
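
As a rough sketch of how a search engine might read such hidden data, the following Python example collects the name/content pairs of meta tags with the standard html.parser module. The page, its keywords, and the class name MetaCollector are hypothetical.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes and "content" in attributes:
                self.meta[attributes["name"]] = attributes["content"]


# Hypothetical page: the title, keywords, and description are made up.
PAGE = """<html>
<head>
<title>Markup languages</title>
<meta name="keywords" content="HTML, XML, SGML">
<meta name="description" content="An overview of markup languages.">
</head>
<body><p>Visible text.</p></body>
</html>"""

collector = MetaCollector()
collector.feed(PAGE)
collector.close()
metadata = collector.meta

for name, content in metadata.items():
    print(name, "=", content)
```

None of the collected values is shown in the browser window, yet a program that reads the page source can use them.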

Three types of HTML descriptors are used.

● For formatting text or individual characters.
● For formatting paragraphs or other large text blocks.
● Invisible descriptors that provide other functionality, such as metadata, to perform searches.

The main advantage of HTML over other markup languages is its tremendous versatility. The current version of HTML is supported by almost any modern graphical browser. This is not always true for dynamic HTML (DHTML), XML, Java, and ActiveX. If you want your Web sites to be accessible to all kinds of browsers, we recommend using HTML.

Dynamic HTML (DHTML)

Dynamic HTML (DHTML) is more flexible than HTML.
Instead of presenting a static Web page to the public, you can use DHTML to create a Web page that a user can customize without damaging the original document. For example, a page rendered with DHTML might contain various elements that the user can move around the page to rearrange its content to their liking. However, when the page is refreshed, the changes disappear and the page returns to its original form.
DHTML supports the following features that are not available in HTML.

● Dynamic styles.
● Precise positioning.
● Data binding.
● Dynamic content.

Not sure what these mean? Don't worry - explanations follow below.

Applying Styles to Web Documents. Dynamic styles are based on the principles of cascading style sheets (CSS): formatting is applied to the page as a whole instead of manually to individual parts of the page.
If you have worked with modern word processors, you may be familiar with style sheets, which automatically format blocks of text according to the style you assign to them. Formatting here means changing the color of the text, its font, placement, visibility - in general, almost any property of the text. CSS (and DHTML) does the same thing, only for rendering Web pages rather than word-processor documents.
Dynamic styles implemented with DHTML provide features not found in word processors. For example, when creating links, you can mark up text so that its color changes automatically when the mouse pointer hovers over it, or so that it is displayed only when the pointer hovers over a specific area of the screen.
The only drawback of these styles is that you must include style sheets in most documents. This is time-consuming work, especially for those with little experience in style sheets or in converting documents.

Placing the text in the right place. Another strength of DHTML is its ability to specify the exact placement of an element on a page. Horizontal (x), vertical (y), and even depth (z) coordinates are used to indicate the position of an object. (Positioning an object along the third axis allows you to "overlap" objects.) Precise positioning allows you to flow text around an image, as well as to move objects within the browser window.

Note:
HTML without CSS does not provide precise placement of objects. In this case, the placement of the elements is determined by the browser.

Inserting data into the page. To give users access to back-end information, for example, data stored in a database, ordinary HTML pages must be linked to the server where the original data is located, and permission to manipulate this data must be requested. DHTML allows you to bind data to a specific page, so that users work with the bound data without affecting the original data and even without interacting with the server that stores it. To do this, data sources are embedded in the page (they can be sorted and filtered in the same way as the contents of any database). This not only reduces server load, but also allows users to view and manipulate data without giving them access to the source of the data itself.

Creating dynamic content. Style sheets enable the Web publisher to easily change the appearance of a page or set of pages.
Dynamic content allows the Web user to change the look and feel of a page by running scripts that:

● insert or hide page elements;
● modify the text;
● change the structure of the text;
● retrieve data from back-end sources and display it at the user's request.

Unlike HTML, which allows the content of a page to be changed only before it is loaded into the user's browser, DHTML can accept changes at any time.
Combined with scripts that let users choose which items to view, dynamic content makes a high level of interactivity possible.

Advice:
In the "Organizing Meetings" section (above), there is a web-based building map showing the location of a particular office and a portrait of the employee that the user is trying to locate. This map was created using DHTML markup tools to create dynamic content.

Extensible Markup Language (XML)

The Extensible Markup Language (XML) does not replace HTML (at least not on Web pages), but it complements it, making Web pages somewhat more versatile.
The idea is this: when you format a page with HTML, you can change the appearance of the text with descriptors that format it as bold, italic, underlined, paragraphs, etc. However, these descriptors have practically nothing to do with the content of the text, only with its formatting. XML, by contrast, has descriptors that describe the meaning of the text rather than its appearance. You can use them to specify what the text stands for (names, addresses, product names, etc.).
Why is this needed? First of all, this metadata allows search engines to find predefined items. If you search your corporation's Web site (created using HTML) for the word "name", hoping to see all the names it contains, all instances of the word "name" are returned, but not the names themselves. However, if XML markup was used when creating the site, any text that has the "name" descriptor will be returned as a result. Second, descriptor-marked portions of text can be useful if you need to apply a setting (such as color or language) only to portions of a Web document. For example, suppose an interactive document is a short story in Spanish with an English translation. Then, instead of switching the whole document from Spanish support to English support, you can mark the Spanish parts of the story with descriptors and apply the rules of the Spanish language only to these parts, leaving the translations in English.
As such, XML makes it much easier to develop a Web page, especially if parts of it need to be created as isolated elements.
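
The difference between searching for the word "name" and searching for text marked with a "name" descriptor can be sketched in Python with the standard xml.etree.ElementTree module. The staff document, its element names, and the people in it are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical staff page: element names and contents are illustrative.
DOCUMENT = """<staff>
  <note>Enter your name in the form below.</note>
  <employee><name>Maria Lopez</name><office>214</office></employee>
  <employee><name>John Smith</name><office>118</office></employee>
</staff>"""

root = ET.fromstring(DOCUMENT)

# A plain word search finds the word "name" itself, wherever it occurs.
word_hits = [el.text for el in root.iter() if el.text and "name" in el.text]

# A search by descriptor returns the text that is marked up as a name.
tagged_names = [el.text for el in root.iter("name")]

print(word_hits)
print(tagged_names)
```

The word search returns only the sentence that happens to contain the word, while the search by descriptor returns the names themselves.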

SGML (Standard Generalized Markup Language) is defined in the ISO 8879 standard. This language is adopted as the main language for technical documentation, including interactive electronic technical manuals for products created in CALS technologies.

SGML defines the structure of documents as a sequence of data objects. Data objects representing parts of a document can be stored in various files. The SGML standard establishes sets of symbols and rules for the presentation of information that allow different systems to correctly recognize and identify this information. These sets are described in a separate part of the document called the Document Type Definition (DTD), which is passed along with the main SGML document. The DTD specifies the correspondence between characters and their codes, the maximum lengths of identifiers used, how the delimiters for tags are represented, other possible conventions, the DTD syntax, and the type and version of the document. Hence SGML can be called a metalanguage for a family of specific markup languages. In particular, the XML and HTML markup languages can be considered subsets of SGML.

An SGML document package includes:

  • the main file with the technical manual, marked up with SGML tags;
  • a description of entities, if the document belongs to a group in which the same entities are used and are assumed to be known;
  • a dictionary explaining the SGML tags.

However, SGML is difficult to learn and use. Therefore, to make markup widely usable in documents presented with WWW technologies, a simplified language, HTML (HyperText Markup Language), was developed on the basis of SGML in 1991, followed in 1996 by XML (eXtensible Markup Language), which, in combination with HTML, is becoming the main language for presenting documents in various applications.

The HTML language was developed with the aim of widespread use of markup in documents presented in WWW technologies.

An HTML description is ASCII text with a sequence of commands (control codes) embedded in it, also called descriptors or tags. This text is called an HTML document or HTML page or, when posted on a Web server, a Web page. Tags are placed at the appropriate places in the source text; they define fonts, hyphenation, the appearance of graphics, links, etc. In WWW editors, commands are inserted simply by pressing the corresponding keys.

XML, like HTML, is considered a subset of SGML. Currently, XML is a candidate for the main language for representing documents in information technology; it can be regarded as a metalanguage that serves as the basis for creating specialized markup languages in various applications. At the same time, XML is more convenient than SGML, because some minor SGML features have been dropped from XML. XML descriptions are easier to read and are adapted for use in modern browsers while retaining the basic capabilities of SGML.

For specific applications, there are variations of XML called XML vocabularies or XML applications. For example, the XML application MathML (Mathematical Markup Language) was developed to describe texts with mathematical symbols, and OSD (Open Software Description) to describe software packages. For CALS, the Product Definition eXchange (PDX) data exchange vocabulary is of interest. There are also vocabularies for chemistry (CML - Chemical Markup Language), biology (BSML - Bioinformatic Sequence Markup Language), etc.

HTML stands for HyperText Markup Language.

The language is used to organize web pages. Here is an analogy. You buy a newspaper. It contains several articles. Each article has a title and photographs, and the text is set in several columns. This is the structure of the newspaper page.

A site works the same way. To give an article (the content) the correct structure, you need to use a text markup language.

What is HTML for

HTML is used to tell the browser how to display the page on the screen.

The language is ubiquitous. It is a versatile tool for laying out content on a page, and it works in any browser. To write code in a programming language, you need to know its specifics: operators, data types, and so on.

HTML, by contrast, consists of a set of tags (commands) and attributes (properties). They are easy to remember, and reference materials can always be found.

What is HTML Code

Code is a set of commands that tell the browser how to display the page. There is a structure that must always be followed. For example, there is only one H1 heading on the page, the main information is placed in a section, etc.

There are three instruments in the language: tags, attributes, and values.

There are two types of tags - paired and single.

  • <tag>...</tag> - a paired tag, opening and closing. The pair acts on the text placed between them.
  • <tag> - a single tag. It acts on the text after it until the next tag.

The structure of the HTML code on the page

We said that the structure of any HTML document is always the same. Next, we list the required elements.

  1. <!DOCTYPE> - indicates that HTML is used in the document.
  2. <html>...</html> - all page code is placed in this tag. Anything that is not placed in it is not recognized by the browser and is not displayed.
  3. <head>...</head> - a paired tag that contains technical information, for example, about the encoding of the document.
    1. <title>...</title> - the title of the page, placed inside the head section. Every page must have its own unique title.
    2. <meta>, <link> - service information, for example, connecting separate styles (CSS) to the page. It is not displayed to the user.
  4. <body>...</body> - the body of the page. All basic information is contained in this tag.
    1. <a>...</a> - hyperlinks.
    2. <img> - images.
    3. <b>...</b> - bold.
    4. <i>...</i> - italics.

There can be an unlimited number of elements inside the body.
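
To make this structure concrete, here is a minimal sketch in Python that feeds a small page to the standard html.parser module and builds an indented outline of its tags. The page contents, the link target, the image file name, and the class name OutlineBuilder are assumptions for illustration only.

```python
from html.parser import HTMLParser

# A minimal page; the title, link target, and image file are hypothetical.
PAGE = """<!DOCTYPE html>
<html>
<head>
<title>My first page</title>
<meta charset="utf-8">
</head>
<body>
<h1>Main heading</h1>
<p>Text with a <a href="https://example.com/">hyperlink</a>,
<b>bold</b> and <i>italic</i> words.</p>
<img src="picture.png" alt="A picture">
</body>
</html>"""


class OutlineBuilder(HTMLParser):
    """Build an indented outline of the tags in a page."""

    # Single tags: they have no closing tag, so they do not open a level.
    SINGLE = {"meta", "img", "br", "hr", "link", "input"}

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []
        self.doctype = None

    def handle_decl(self, decl):
        self.doctype = decl

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        if tag not in self.SINGLE:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in self.SINGLE:
            self.depth -= 1


outline = OutlineBuilder()
outline.feed(PAGE)
outline.close()

print(outline.doctype)
print("\n".join(outline.lines))
```

The outline shows the head and body sections nested inside the html element, with the title and service tags under head and the visible elements under body.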

The more often you use tags, the faster they are remembered. You can always find a reference with all tags, attributes and their values.

Any document has three components:

· Structure;

· Content;

· Style.

Content is the information that is displayed in the document. The content of a document on paper can be purely textual, and also contain images. If the document is presented in electronic form, it may contain multimedia data, as well as links to other documents. Although the content of different documents is different, they can be classified by type, for example, a book or a train ticket.

The style of the document determines the way its content is displayed on a particular device (for example, a printer or a display). The concept of style includes the characteristics of the font (name, size, color) of the entire output document or its individual blocks, the order of pagination, the arrangement of blocks on the pages and other parameters. The same document can be output in different styles both on different media and on the same media.

Document markup languages are artificial languages designed to describe the structure of a document and the relationships between the objects of that structure. The markup data is also called metadata.

The first markup language was the Generalized Markup Language (GML), developed by IBM employees in the 1960s. Its immediate successor was the Standard Generalized Markup Language (SGML), which defines the rules for writing markup elements in a document. A document that conforms to the rules of the language is called an SGML document.

SGML is defined in the ISO 8879 standard, which specifies the following basic requirements for a document markup language:

· The language must be human readable.

· Marked-up document files must be text and encoded using ASCII (American Standard Code for Information Interchange) characters. However, the content of the document does not have to be ASCII encoded or text.

SGML and similar languages use special document markup tools:

· Elements and accompanying attributes;

· Entities;

· Comments.

The structural unit of an SGML document is an element. In the marked-up text, each element must be delimited in a specific way: a start tag is inserted at the beginning of the element and an end tag at the end of the element. The start and end tags have the same name. To distinguish tags from regular text, each tag must begin with a tag start character and end with a tag end character. In addition, the end tag contains a character that marks it as an end tag. In SGML, any characters can be used for these purposes, but most often "<" (left angle bracket) is used as the tag start character, ">" (right angle bracket) as the tag end character, and "/" (slash) as the end-tag sign. Elements in an SGML document can contain other elements, so an SGML document can be represented graphically as a hierarchical (tree) structure.


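Example 4.3.1. An SGML document defining a list of students with the results of their exam session can be specified as follows:

<student-list>
  <title>List of student grades in session</title>
  <student>
    <full-name>Ivanov Ivan Ivanovich</full-name>
    <group-number>TS-61</group-number>
    <mark-list>
      <mark>A</mark>
      <mark>B</mark>
      <mark>B</mark>
      <mark>B</mark>
    </mark-list>
  </student>
  <student>
    <full-name>Petrov Petr Petrovich</full-name>
    <group-number>TS-62</group-number>
    <mark-list>
      <mark>C</mark>
      <mark>C</mark>
      <mark>D</mark>
      <mark>C</mark>
    </mark-list>
  </student>
</student-list>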

In this document, the first element is the student-list element. This element contains one title element (title) and several student elements (data about the student). In turn, each student element contains one full-name element (last name, first name and patronymic of the student), one group-number element (group number), and one mark-list element (list of student's grades in the session). Finally, the mark-list element contains several mark (mark) elements.
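
The hierarchy described above can be walked programmatically. The sketch below is an assumption-laden illustration: Python's standard library has no SGML parser, so the document is parsed with xml.etree.ElementTree, which works here only because this particular document is also well-formed XML. The helper name outline is made up for the example.

```python
import xml.etree.ElementTree as ET

# Element names follow the description in the text. An XML parser is used
# in place of an SGML one, which is an assumption of this sketch.
STUDENT_LIST = """<student-list>
  <title>List of student grades in session</title>
  <student>
    <full-name>Ivanov Ivan Ivanovich</full-name>
    <group-number>TS-61</group-number>
    <mark-list>
      <mark>A</mark><mark>B</mark><mark>B</mark><mark>B</mark>
    </mark-list>
  </student>
  <student>
    <full-name>Petrov Petr Petrovich</full-name>
    <group-number>TS-62</group-number>
    <mark-list>
      <mark>C</mark><mark>C</mark><mark>D</mark><mark>C</mark>
    </mark-list>
  </student>
</student-list>"""


def outline(element, depth=0):
    """Return the element tree as indented lines, one per element."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(STUDENT_LIST)
tree_lines = outline(root)

# Map each student's full name to the list of marks in the session.
marks = {
    student.findtext("full-name"): [m.text for m in student.iter("mark")]
    for student in root.findall("student")
}

print("\n".join(tree_lines))
print(marks)
```

Each level of indentation in the printed outline corresponds to one level of nesting in the tree.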

A graphical representation of this list has a tree structure (Fig. 4.3.1):

Fig. 4.3.1. Graphical representation of the SGML document structure

Attributes can be used to refine SGML elements. Attributes are written in the start tag of an element as follows:

attribute-name = "attribute-value".

An element can have multiple attributes. Attributes are separated from each other and the element name by at least one space.

Example 4.3.2. For the mark elements in example 4.3.1, you can set the subject attribute, whose value is the name of the discipline in which the exam was taken. Then, for the first student, the elements take the following form:

<mark subject="...">A</mark>

<mark subject="...">B</mark>

<mark subject="...">B</mark>

<mark subject="...">B</mark>
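
Attribute values can be read back by a program. In the following Python sketch the subject names are hypothetical placeholders (the source does not preserve the real ones), and the marks are parsed as XML with the standard xml.etree.ElementTree module, which is an assumption of the example.

```python
import xml.etree.ElementTree as ET

# The subject names below are hypothetical placeholders.
MARK_LIST = """<mark-list>
  <mark subject="Mathematics">A</mark>
  <mark subject="Physics">B</mark>
  <mark subject="Chemistry">B</mark>
  <mark subject="History">B</mark>
</mark-list>"""

mark_list = ET.fromstring(MARK_LIST)

# Pair each attribute value (the subject) with the element content (the mark).
grades = {mark.get("subject"): mark.text for mark in mark_list.findall("mark")}

print(grades)
```

The attribute refines the element: the content still holds the mark, while the attribute says which discipline it belongs to.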

Languages such as SGML use entities to work with data groups. An entity is any named data, both textual and non-textual. When viewing the document, the name of the entity is replaced with its value. So, for example, the name of the text entity kpi will be replaced with its value: Kiev Polytechnic Institute, and the non-text entity image1 will be replaced with an image named image1.
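
The replacement of a text entity by its value can be sketched in Python. The example uses XML syntax, which inherited entity declarations from SGML, and the standard xml.etree.ElementTree module; the note element and its sentence are made up, and only the kpi entity comes from the text.

```python
import xml.etree.ElementTree as ET

# The entity kpi is declared in the internal DTD subset and then
# referenced in the text as &kpi;.
DOCUMENT = """<!DOCTYPE note [
  <!ENTITY kpi "Kiev Polytechnic Institute">
]>
<note>Graduates of &kpi; are invited.</note>"""

note = ET.fromstring(DOCUMENT)

# The parser substitutes the entity value while reading the document.
expanded_text = note.text

print(expanded_text)
```

The author writes the short entity name once per use, and every reader of the document sees the full value.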

The term "markup" derives from the traditional practice of marking up manuscripts before publication (that is, adding symbolic commands in the margins and between lines in a paper manuscript).

For many centuries, this was done by publishing staff (editors and proofreaders), who noted the font, style, and size in which fragments of text should be set, and then passed the manuscript to typesetters, who set the text by hand according to the markup.

A markup language is a set of special instructions, called tags, that perform the following functions:

      setting functions for processing selected elements;

      selection of logical elements of the document.

Setting functions for processing selected items

Word processors have built-in commands for switching fonts on and off, as well as similar commands for controlling the placement of information on the screen or in print. This approach is called command or procedural markup.

Examples of procedural markup

Selection of logical elements of the document

This kind of markup gives a document a structure and defines relationships between the elements of that structure without specifying how they are to be processed. Such markup is called descriptive.

By changing the set of procedures associated with the descriptive markup, you can change the appearance of the same document.

Descriptive markup

The main advantage of descriptive markup is its flexibility, since the pieces of text are marked as “what they are” (rather than “how they should be displayed”).
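
The idea that one descriptively marked-up document can be shown in different ways by swapping the processing procedures can be sketched in Python. Everything named here (the article, title, and para elements, the two style tables, and the render function) is hypothetical and exists only for this illustration.

```python
import xml.etree.ElementTree as ET

# Descriptive markup: elements say what the text is, not how it looks.
DOCUMENT = (
    "<article><title>Markup</title>"
    "<para>Tags describe text.</para></article>"
)

# Two interchangeable sets of procedures for the same element names.
PLAIN_STYLE = {
    "title": lambda text: text.upper(),
    "para": lambda text: "    " + text,
}
HTML_STYLE = {
    "title": lambda text: "<h1>" + text + "</h1>",
    "para": lambda text: "<p>" + text + "</p>",
}


def render(document, style):
    """Apply the procedure that the style assigns to each element."""
    root = ET.fromstring(document)
    return [style[child.tag](child.text) for child in root]


plain_lines = render(DOCUMENT, PLAIN_STYLE)
html_lines = render(DOCUMENT, HTML_STYLE)

print(plain_lines)
print(html_lines)
```

The document itself never changes; only the table of procedures does, which is the flexibility that descriptive markup is credited with.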

In the future, software may be written to process these fragments in a way that was not even foreseen by the language developers. For example, HTML hyperlinks, originally intended for users to navigate through a set of links in the network, later began to be used by search and indexing mechanisms on the web, to assess the popularity of resources, and so on.
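
As an illustration of such later reuse, the sketch below collects hyperlink targets from a page fragment, roughly as an indexing program might. It uses Python's standard html.parser module; the fragment, its URLs, and the class name LinkCollector are assumptions.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag, as an indexer might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


# Hypothetical page fragment; the URLs are placeholders.
FRAGMENT = (
    '<p>See the <a href="https://example.com/html">HTML notes</a> and '
    'the <a href="https://example.com/xml">XML notes</a>.</p>'
)

collector = LinkCollector()
collector.feed(FRAGMENT)
collector.close()
links = collector.links

print(links)
```

The same markup that lets a reader follow a link also lets a program list every page that the document points to.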

Examples of markup languages

Markup languages are used wherever rich text output is required:

    in printing (SGML, TeX, PostScript, PDF),

    in computer user interfaces (Microsoft Word, OpenOffice, troff),

    on the World Wide Web (HTML, XHTML, XML, WML, VML, PGML, SVG, XBRL).

Markup language tag structure

The development of descriptive markup ideas led to the definition of markup as a formal language.

Language tags (control descriptors) are encoded in a certain way (allocated relative to the main content of the document) and serve as instructions for the program that displays the content of the document on the client side.

Many modern languages (HTML and XML among them) delimit these commands (language tags) with the characters < and >, between which the names of instructions and their parameters are placed. In SGML, you can assign other characters to enclose the tag (for example, curly braces). In addition, there are simplified languages with fewer capabilities; for example, the BBCode markup language, used in web forums and message boards, delimits its tags with square brackets, as in [b]...[/b].

The tag model describes a document as a collection of containers, each of which begins and ends with tags. In most cases, tags are used in pairs. The pair consists of a start tag and an end tag.

Opening tag syntax: <tag_name [attributes]>

The name of the closing tag differs from the name of the opening tag only in that it is preceded by a forward slash: </tag_name>

Attributes define additional characteristics of an element. Tag attributes are written in the following format: name [= "value"]. For some attributes, a value may not be specified. The end tag has no attributes.
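
Both attribute forms, with and without a value, can be observed with Python's standard html.parser module. The input tag used here is just a convenient sample, and the class name AttributeLogger is made up for the sketch.

```python
from html.parser import HTMLParser


class AttributeLogger(HTMLParser):
    """Record the attributes of every opening tag."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


logger = AttributeLogger()
# "type" has a value; "checked" is written without one.
logger.feed('<input type="checkbox" checked>')
logger.close()
seen = logger.seen

for tag, attrs in seen:
    print(tag, attrs)
```

The parser reports each attribute as a name and value pair, and an attribute written without a value comes back with None in place of the value.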

Any paired tag starts with a start tag and ends when a matching end tag is encountered.

A pair of opening and closing tags is called a container, and the part of the text between them is called an element.

<h1>Level 1 heading</h1>

<h2>Level 2 heading</h2>

Depending on the markup language used, a single tag or an empty-element tag may also be available. The tag name defines the type of the element.

Single tag syntax: <tag_name [attributes] />

In some markup languages (HTML), tag names are predefined. In others (XML), they are not strictly regulated, i.e. users can introduce and use new tags. For example, the tag "persona" can define the type of an XML element that holds a last name, first name and patronymic: <persona>Ivanov Ivan Ivanovich</persona>

In SGML, elements can overlap, that is, one element may be opened inside another and closed only after the enclosing element has ended.

In XML, elements have a strict syntactic structure, that is, they are strictly nested and always closed.

In addition, in SGML and HTML, some elements do not have to be closed.
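
These differences can be checked with a short Python sketch. It uses the standard xml.etree.ElementTree module as the strict XML reader and html.parser as a lenient HTML reader; the tag names a and b and the helper names are hypothetical, and no SGML parser is involved.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


class StartEndCounter(HTMLParser):
    """Count opening and closing tags without enforcing pairing."""

    def __init__(self):
        super().__init__()
        self.starts = 0
        self.ends = 0

    def handle_starttag(self, tag, attrs):
        self.starts += 1

    def handle_endtag(self, tag):
        self.ends += 1


nested = "<a><b>text</b></a>"       # strictly nested
overlapping = "<a><b>text</a></b>"  # elements overlap
unclosed = "<p>first<p>second"      # HTML style, end tags omitted

# The lenient HTML reader accepts the unclosed paragraphs as they are.
counter = StartEndCounter()
counter.feed(unclosed)
counter.close()

print(is_well_formed(nested), is_well_formed(overlapping), is_well_formed(unclosed))
print(counter.starts, counter.ends)
```

The XML reader accepts only the strictly nested string, while the HTML reader simply reports two opening tags and no closing ones.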

In almost all document markup languages, the attribute value is interpreted as text. The attribute value is usually enclosed in quotation marks.

Note:

A document written using a markup language contains not only the text itself (like a sequence of words and punctuation marks), but also additional information about its various parts - for example, an indication of headings, highlights, lists, etc.

That is, the document is nothing more than a regular ASCII file with control codes (tags) added to it.